* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests [not found] ` <20120126144632.GM21211@redhat.com> @ 2012-01-26 15:18 ` Eric Blake 2012-01-26 19:35 ` Luiz Capitulino ` (2 more replies) 0 siblings, 3 replies; 18+ messages in thread From: Eric Blake @ 2012-01-26 15:18 UTC (permalink / raw) To: Daniel P. Berrange; +Cc: libvir-list, Michal Privoznik, QEMU Developers [-- Attachment #1: Type: text/plain, Size: 3126 bytes --] [adding qemu-devel] On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >> One thing, that you'll probably notice is this >> 'set-support-level' command. Basically, it tells GA what qemu version >> is it running on. Ideally, this should be done as soon as >> GA starts up. However, that cannot be determined from outside >> world as GA doesn't emit any events yet. >> Ideally^2 this command should be left out as it should be qemu >> who tells its own agent this kind of information. >> Anyway, I was going to call this command in qemuProcess{Startup, >> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >> so guest can boot and start GA, but that implies returning from qemuProcess*. >> >> So I am setting this just before 'guest-suspend' command, as >> there is one more thing about GA. It is unable to remember anything >> upon its restart (GA process). Which has BTW show flaw >> in our current code with FS freeze & thaw. If we freeze guest >> FS, and somebody restart GA, the simple FS Thaw will not succeed as >> GA thinks FS are not frozen. But that's a different cup of tea. >> >> Because of what written above, we need to call set-level >> on every suspend. > > > IMHO all this says that the 'set-level' command is a conceptually > unfixably broken design & should be killed in QEMU before it turns > into an even bigger mess. > > Once we're in a situation where we need to call 'set-level' prior > to every single invocation, you might as well just allow the QEMU > version number to be passed in directly as an arg to the command > you are running directly thus avoiding this horrificness. Qemu folks, would you care to chime in on this? Exactly how is the set-level command supposed to work? As I understand it, the goal is that if the guest has qemu-ga 1.1 installed, but is being run by qemu 1.0, then we want to ensure that any guest agent command supported by qemu-ga 1.1 but requiring features of qemu not present in qemu 1.0 will be properly rejected. But whose job is it to tell the guest agent what version of qemu is running? Based on the above conversation, it looks like the current qemu implementation does not do any handshaking on its own when the guest agent first comes alive, which means that you are forcing the work on the management app (libvirt). And this is inherently racy - if the guest is allowed to restart its qemu-ga process at will, and each restart of that guest process triggers a need to redo the handshake, then libvirt can never reliably know what version the agent is running at. I think we really do need a mode where as soon as the qemu-ga process exists, it sends an event, then the qemu side of the virtio bus responds to that event by telling the agent what version of qemu is talking to the agent, all prior to exposing any agent commands out to management apps, thus making the qemu-ga set-level command an automatic part of the handshake, and invisible to management apps. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 620 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-26 15:18 ` [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests Eric Blake @ 2012-01-26 19:35 ` Luiz Capitulino 2012-01-26 19:41 ` Michal Privoznik 2012-01-26 22:57 ` Anthony Liguori 2012-01-26 22:54 ` Anthony Liguori 2012-01-27 0:01 ` Michael Roth 2 siblings, 2 replies; 18+ messages in thread From: Luiz Capitulino @ 2012-01-26 19:35 UTC (permalink / raw) To: Eric Blake; +Cc: libvir-list, Michal Privoznik, QEMU Developers, mdroth On Thu, 26 Jan 2012 08:18:03 -0700 Eric Blake <eblake@redhat.com> wrote: > [adding qemu-devel] > > On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: > >> One thing, that you'll probably notice is this > >> 'set-support-level' command. Basically, it tells GA what qemu version > >> is it running on. Ideally, this should be done as soon as > >> GA starts up. However, that cannot be determined from outside > >> world as GA doesn't emit any events yet. > >> Ideally^2 this command should be left out as it should be qemu > >> who tells its own agent this kind of information. > >> Anyway, I was going to call this command in qemuProcess{Startup, > >> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs > >> so guest can boot and start GA, but that implies returning from qemuProcess*. > >> > >> So I am setting this just before 'guest-suspend' command, as > >> there is one more thing about GA. It is unable to remember anything > >> upon its restart (GA process). Which has BTW show flaw > >> in our current code with FS freeze & thaw. If we freeze guest > >> FS, and somebody restart GA, the simple FS Thaw will not succeed as > >> GA thinks FS are not frozen. But that's a different cup of tea. > >> > >> Because of what written above, we need to call set-level > >> on every suspend. > > > > > > IMHO all this says that the 'set-level' command is a conceptually > > unfixably broken design & should be killed in QEMU before it turns > > into an even bigger mess. Can you elaborate on this? Michal and I talked on irc about making the compatibility level persistent, would that help? > > Once we're in a situation where we need to call 'set-level' prior > > to every single invocation, you might as well just allow the QEMU > > version number to be passed in directly as an arg to the command > > you are running directly thus avoiding this horrificness. > > Qemu folks, would you care to chime in on this? > > Exactly how is the set-level command supposed to work? As I understand > it, the goal is that if the guest has qemu-ga 1.1 installed, but is > being run by qemu 1.0, then we want to ensure that any guest agent > command supported by qemu-ga 1.1 but requiring features of qemu not > present in qemu 1.0 will be properly rejected. Not exactly, the default support of qemu-ga is qemu 1.0. This means that by default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This way the set-support-level command allows you to specify that qemu 2.0 features are supported. Note that this is only about specific features that depend on host support, like S3 suspend which is known to be buggy in current and old qemu. > But whose job is it to tell the guest agent what version of qemu is > running? Based on the above conversation, it looks like the current > qemu implementation does not do any handshaking on its own when the > guest agent first comes alive, which means that you are forcing the work > on the management app (libvirt). And this is inherently racy - if the > guest is allowed to restart its qemu-ga process at will, and each > restart of that guest process triggers a need to redo the handshake, > then libvirt can never reliably know what version the agent is running at. Making the set-support-level persistent would solve it, wouldn't it? > > I think we really do need a mode where as soon as the qemu-ga process > exists, it sends an event, then the qemu side of the virtio bus responds > to that event by telling the agent what version of qemu is talking to > the agent, all prior to exposing any agent commands out to management > apps, thus making the qemu-ga set-level command an automatic part of the > handshake, and invisible to management apps. > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-26 19:35 ` Luiz Capitulino @ 2012-01-26 19:41 ` Michal Privoznik 2012-01-26 20:13 ` Luiz Capitulino 2012-01-26 22:57 ` Anthony Liguori 1 sibling, 1 reply; 18+ messages in thread From: Michal Privoznik @ 2012-01-26 19:41 UTC (permalink / raw) To: Luiz Capitulino; +Cc: libvir-list, Eric Blake, QEMU Developers, mdroth On 26.01.2012 20:35, Luiz Capitulino wrote: > On Thu, 26 Jan 2012 08:18:03 -0700 > Eric Blake <eblake@redhat.com> wrote: > >> [adding qemu-devel] >> >> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >>>> One thing, that you'll probably notice is this >>>> 'set-support-level' command. Basically, it tells GA what qemu version >>>> is it running on. Ideally, this should be done as soon as >>>> GA starts up. However, that cannot be determined from outside >>>> world as GA doesn't emit any events yet. >>>> Ideally^2 this command should be left out as it should be qemu >>>> who tells its own agent this kind of information. >>>> Anyway, I was going to call this command in qemuProcess{Startup, >>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >>>> so guest can boot and start GA, but that implies returning from qemuProcess*. >>>> >>>> So I am setting this just before 'guest-suspend' command, as >>>> there is one more thing about GA. It is unable to remember anything >>>> upon its restart (GA process). Which has BTW show flaw >>>> in our current code with FS freeze & thaw. If we freeze guest >>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as >>>> GA thinks FS are not frozen. But that's a different cup of tea. >>>> >>>> Because of what written above, we need to call set-level >>>> on every suspend. >>> >>> >>> IMHO all this says that the 'set-level' command is a conceptually >>> unfixably broken design & should be killed in QEMU before it turns >>> into an even bigger mess. > > Can you elaborate on this? Michal and I talked on irc about making the > compatibility level persistent, would that help? > >>> Once we're in a situation where we need to call 'set-level' prior >>> to every single invocation, you might as well just allow the QEMU >>> version number to be passed in directly as an arg to the command >>> you are running directly thus avoiding this horrificness. >> >> Qemu folks, would you care to chime in on this? >> >> Exactly how is the set-level command supposed to work? As I understand >> it, the goal is that if the guest has qemu-ga 1.1 installed, but is >> being run by qemu 1.0, then we want to ensure that any guest agent >> command supported by qemu-ga 1.1 but requiring features of qemu not >> present in qemu 1.0 will be properly rejected. > > Not exactly, the default support of qemu-ga is qemu 1.0. This means that by > default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This > way the set-support-level command allows you to specify that qemu 2.0 features > are supported. > > Note that this is only about specific features that depend on host support, > like S3 suspend which is known to be buggy in current and old qemu. > >> But whose job is it to tell the guest agent what version of qemu is >> running? Based on the above conversation, it looks like the current >> qemu implementation does not do any handshaking on its own when the >> guest agent first comes alive, which means that you are forcing the work >> on the management app (libvirt). And this is inherently racy - if the >> guest is allowed to restart its qemu-ga process at will, and each >> restart of that guest process triggers a need to redo the handshake, >> then libvirt can never reliably know what version the agent is running at. > > Making the set-support-level persistent would solve it, wouldn't it? Yes and no. We still need an event when GA come to live. Because if anybody tries to write something for GA which is not running (and for purpose of this scenario assume it never will), like 'set-support-level' and wait for answer (which will never come) he will be blocked indefinitely. However, if he writes it after 1st event come, everything is OK. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-26 19:41 ` Michal Privoznik @ 2012-01-26 20:13 ` Luiz Capitulino 2012-01-26 22:51 ` Michael Roth 0 siblings, 1 reply; 18+ messages in thread From: Luiz Capitulino @ 2012-01-26 20:13 UTC (permalink / raw) To: Michal Privoznik; +Cc: libvir-list, Eric Blake, QEMU Developers, mdroth On Thu, 26 Jan 2012 20:41:13 +0100 Michal Privoznik <mprivozn@redhat.com> wrote: > On 26.01.2012 20:35, Luiz Capitulino wrote: > > On Thu, 26 Jan 2012 08:18:03 -0700 > > Eric Blake <eblake@redhat.com> wrote: > > > >> [adding qemu-devel] > >> > >> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: > >>>> One thing, that you'll probably notice is this > >>>> 'set-support-level' command. Basically, it tells GA what qemu version > >>>> is it running on. Ideally, this should be done as soon as > >>>> GA starts up. However, that cannot be determined from outside > >>>> world as GA doesn't emit any events yet. > >>>> Ideally^2 this command should be left out as it should be qemu > >>>> who tells its own agent this kind of information. > >>>> Anyway, I was going to call this command in qemuProcess{Startup, > >>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs > >>>> so guest can boot and start GA, but that implies returning from qemuProcess*. > >>>> > >>>> So I am setting this just before 'guest-suspend' command, as > >>>> there is one more thing about GA. It is unable to remember anything > >>>> upon its restart (GA process). Which has BTW show flaw > >>>> in our current code with FS freeze & thaw. If we freeze guest > >>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as > >>>> GA thinks FS are not frozen. But that's a different cup of tea. > >>>> > >>>> Because of what written above, we need to call set-level > >>>> on every suspend. > >>> > >>> > >>> IMHO all this says that the 'set-level' command is a conceptually > >>> unfixably broken design & should be killed in QEMU before it turns > >>> into an even bigger mess. > > > > Can you elaborate on this? Michal and I talked on irc about making the > > compatibility level persistent, would that help? > > > >>> Once we're in a situation where we need to call 'set-level' prior > >>> to every single invocation, you might as well just allow the QEMU > >>> version number to be passed in directly as an arg to the command > >>> you are running directly thus avoiding this horrificness. > >> > >> Qemu folks, would you care to chime in on this? > >> > >> Exactly how is the set-level command supposed to work? As I understand > >> it, the goal is that if the guest has qemu-ga 1.1 installed, but is > >> being run by qemu 1.0, then we want to ensure that any guest agent > >> command supported by qemu-ga 1.1 but requiring features of qemu not > >> present in qemu 1.0 will be properly rejected. > > > > Not exactly, the default support of qemu-ga is qemu 1.0. This means that by > > default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This > > way the set-support-level command allows you to specify that qemu 2.0 features > > are supported. > > > > Note that this is only about specific features that depend on host support, > > like S3 suspend which is known to be buggy in current and old qemu. > > > >> But whose job is it to tell the guest agent what version of qemu is > >> running? Based on the above conversation, it looks like the current > >> qemu implementation does not do any handshaking on its own when the > >> guest agent first comes alive, which means that you are forcing the work > >> on the management app (libvirt). And this is inherently racy - if the > >> guest is allowed to restart its qemu-ga process at will, and each > >> restart of that guest process triggers a need to redo the handshake, > >> then libvirt can never reliably know what version the agent is running at. > > > > Making the set-support-level persistent would solve it, wouldn't it? > > Yes and no. We still need an event when GA come to live. Because if > anybody tries to write something for GA which is not running (and for > purpose of this scenario assume it never will), like 'set-support-level' > and wait for answer (which will never come) he will be blocked > indefinitely. However, if he writes it after 1st event come, everything > is OK. What if the event never reach libvirt? This problem is a lot more general and is not related to the set-support-level command. Maybe adding shutdown & start events can serve as good hints, but they won't fix the problem. IMHO, the best way to solve this is to issue the guest-sync command with a timeout. If you get no answer, then try again later. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-26 20:13 ` Luiz Capitulino @ 2012-01-26 22:51 ` Michael Roth 0 siblings, 0 replies; 18+ messages in thread From: Michael Roth @ 2012-01-26 22:51 UTC (permalink / raw) To: Luiz Capitulino Cc: libvir-list, Michal Privoznik, Eric Blake, QEMU Developers On 01/26/2012 02:13 PM, Luiz Capitulino wrote: > On Thu, 26 Jan 2012 20:41:13 +0100 > Michal Privoznik<mprivozn@redhat.com> wrote: > >> On 26.01.2012 20:35, Luiz Capitulino wrote: >>> On Thu, 26 Jan 2012 08:18:03 -0700 >>> Eric Blake<eblake@redhat.com> wrote: >>> >>>> [adding qemu-devel] >>>> >>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >>>>>> One thing, that you'll probably notice is this >>>>>> 'set-support-level' command. Basically, it tells GA what qemu version >>>>>> is it running on. Ideally, this should be done as soon as >>>>>> GA starts up. However, that cannot be determined from outside >>>>>> world as GA doesn't emit any events yet. >>>>>> Ideally^2 this command should be left out as it should be qemu >>>>>> who tells its own agent this kind of information. >>>>>> Anyway, I was going to call this command in qemuProcess{Startup, >>>>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >>>>>> so guest can boot and start GA, but that implies returning from qemuProcess*. >>>>>> >>>>>> So I am setting this just before 'guest-suspend' command, as >>>>>> there is one more thing about GA. It is unable to remember anything >>>>>> upon its restart (GA process). Which has BTW show flaw >>>>>> in our current code with FS freeze& thaw. If we freeze guest >>>>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as >>>>>> GA thinks FS are not frozen. But that's a different cup of tea. >>>>>> >>>>>> Because of what written above, we need to call set-level >>>>>> on every suspend. >>>>> >>>>> >>>>> IMHO all this says that the 'set-level' command is a conceptually >>>>> unfixably broken design& should be killed in QEMU before it turns >>>>> into an even bigger mess. >>> >>> Can you elaborate on this? Michal and I talked on irc about making the >>> compatibility level persistent, would that help? >>> >>>>> Once we're in a situation where we need to call 'set-level' prior >>>>> to every single invocation, you might as well just allow the QEMU >>>>> version number to be passed in directly as an arg to the command >>>>> you are running directly thus avoiding this horrificness. >>>> >>>> Qemu folks, would you care to chime in on this? >>>> >>>> Exactly how is the set-level command supposed to work? As I understand >>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is >>>> being run by qemu 1.0, then we want to ensure that any guest agent >>>> command supported by qemu-ga 1.1 but requiring features of qemu not >>>> present in qemu 1.0 will be properly rejected. >>> >>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that by >>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This >>> way the set-support-level command allows you to specify that qemu 2.0 features >>> are supported. >>> >>> Note that this is only about specific features that depend on host support, >>> like S3 suspend which is known to be buggy in current and old qemu. >>> >>>> But whose job is it to tell the guest agent what version of qemu is >>>> running? Based on the above conversation, it looks like the current >>>> qemu implementation does not do any handshaking on its own when the >>>> guest agent first comes alive, which means that you are forcing the work >>>> on the management app (libvirt). And this is inherently racy - if the >>>> guest is allowed to restart its qemu-ga process at will, and each >>>> restart of that guest process triggers a need to redo the handshake, >>>> then libvirt can never reliably know what version the agent is running at. >>> >>> Making the set-support-level persistent would solve it, wouldn't it? >> >> Yes and no. We still need an event when GA come to live. Because if >> anybody tries to write something for GA which is not running (and for >> purpose of this scenario assume it never will), like 'set-support-level' >> and wait for answer (which will never come) he will be blocked >> indefinitely. However, if he writes it after 1st event come, everything >> is OK. > > What if the event never reach libvirt? > > This problem is a lot more general and is not related to the > set-support-level command. Maybe adding shutdown& start events can serve as > good hints, but they won't fix the problem. Yah, start up events are a good indicator to issue the guest-sync sequence (we had them at one point, and planned to re-add them for QMP integration, and since libvirt is taking on this role for now it might make sense to re-add it now), but once that sequence is issued the agent can still be manually stopped, or the guest-sync sequence itself can timeout. And there's no way to reliably send a stop indicator, maybe to capture shutdown events, but not consistently enough that we can base the protocol on it (agent might get pkill -9'd for instance, and virtio-serial doesn't currently plumb a guest-side disconnect to the chardev front-end, so you'd never know). So, the only indication you'll ever get that your "session" ended is either a timeout, or, if we add it, a start up event. In either case the response is to issue the reset sequence. The way it would need to work with resets is everytime a command times out you: 1) report the timeout error to libvirt client/management app. set guest-agent_available = 0, such that further libvirt calls that depend on it would return "not currently available", or something to that effect. 2) issue guest-sync with new unique session id 3) read a json object/response. - if you time out, goto 2 - if your response doesn't have the session id you're expecting, repeat 3) (since it may be a response to a previous guest-sync RPC that you timed out on prematurely, but you can't just wait indefinitely, since it may never arrive) 4) set guest_agent_available = 1, proceed with normal operation till the next timeout (or start up event, if we add one). > > IMHO, the best way to solve this is to issue the guest-sync command with > a timeout. If you get no answer, then try again later. > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-26 19:35 ` Luiz Capitulino 2012-01-26 19:41 ` Michal Privoznik @ 2012-01-26 22:57 ` Anthony Liguori 2012-01-30 12:57 ` Luiz Capitulino 1 sibling, 1 reply; 18+ messages in thread From: Anthony Liguori @ 2012-01-26 22:57 UTC (permalink / raw) To: Luiz Capitulino Cc: libvir-list, Michal Privoznik, Eric Blake, QEMU Developers, mdroth On 01/26/2012 01:35 PM, Luiz Capitulino wrote: > On Thu, 26 Jan 2012 08:18:03 -0700 > Eric Blake<eblake@redhat.com> wrote: > >> [adding qemu-devel] >> >> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >>>> One thing, that you'll probably notice is this >>>> 'set-support-level' command. Basically, it tells GA what qemu version >>>> is it running on. Ideally, this should be done as soon as >>>> GA starts up. However, that cannot be determined from outside >>>> world as GA doesn't emit any events yet. >>>> Ideally^2 this command should be left out as it should be qemu >>>> who tells its own agent this kind of information. >>>> Anyway, I was going to call this command in qemuProcess{Startup, >>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >>>> so guest can boot and start GA, but that implies returning from qemuProcess*. >>>> >>>> So I am setting this just before 'guest-suspend' command, as >>>> there is one more thing about GA. It is unable to remember anything >>>> upon its restart (GA process). Which has BTW show flaw >>>> in our current code with FS freeze& thaw. If we freeze guest >>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as >>>> GA thinks FS are not frozen. But that's a different cup of tea. >>>> >>>> Because of what written above, we need to call set-level >>>> on every suspend. >>> >>> >>> IMHO all this says that the 'set-level' command is a conceptually >>> unfixably broken design& should be killed in QEMU before it turns >>> into an even bigger mess. > > Can you elaborate on this? Michal and I talked on irc about making the > compatibility level persistent, would that help? > >>> Once we're in a situation where we need to call 'set-level' prior >>> to every single invocation, you might as well just allow the QEMU >>> version number to be passed in directly as an arg to the command >>> you are running directly thus avoiding this horrificness. >> >> Qemu folks, would you care to chime in on this? >> >> Exactly how is the set-level command supposed to work? As I understand >> it, the goal is that if the guest has qemu-ga 1.1 installed, but is >> being run by qemu 1.0, then we want to ensure that any guest agent >> command supported by qemu-ga 1.1 but requiring features of qemu not >> present in qemu 1.0 will be properly rejected. > > Not exactly, the default support of qemu-ga is qemu 1.0. This means that by > default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This > way the set-support-level command allows you to specify that qemu 2.0 features > are supported. Version numbers are meaningless. What happens when a bunch of features get backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of 2.0? The feature negotiation mechanism we have in QMP is the existence of a command. If we're in a position where we're trying to disable part of a command, it simply means that we should have multiple commands such that we can just remove the disabled part entirely. Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-26 22:57 ` Anthony Liguori @ 2012-01-30 12:57 ` Luiz Capitulino 2012-01-30 13:54 ` Anthony Liguori 2012-01-30 15:03 ` Michael Roth 0 siblings, 2 replies; 18+ messages in thread From: Luiz Capitulino @ 2012-01-30 12:57 UTC (permalink / raw) To: Anthony Liguori Cc: libvir-list, Michal Privoznik, Eric Blake, QEMU Developers, mdroth On Thu, 26 Jan 2012 16:57:01 -0600 Anthony Liguori <anthony@codemonkey.ws> wrote: > On 01/26/2012 01:35 PM, Luiz Capitulino wrote: > > On Thu, 26 Jan 2012 08:18:03 -0700 > > Eric Blake<eblake@redhat.com> wrote: > > > >> [adding qemu-devel] > >> > >> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: > >>>> One thing, that you'll probably notice is this > >>>> 'set-support-level' command. Basically, it tells GA what qemu version > >>>> is it running on. Ideally, this should be done as soon as > >>>> GA starts up. However, that cannot be determined from outside > >>>> world as GA doesn't emit any events yet. > >>>> Ideally^2 this command should be left out as it should be qemu > >>>> who tells its own agent this kind of information. > >>>> Anyway, I was going to call this command in qemuProcess{Startup, > >>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs > >>>> so guest can boot and start GA, but that implies returning from qemuProcess*. > >>>> > >>>> So I am setting this just before 'guest-suspend' command, as > >>>> there is one more thing about GA. It is unable to remember anything > >>>> upon its restart (GA process). Which has BTW show flaw > >>>> in our current code with FS freeze& thaw. If we freeze guest > >>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as > >>>> GA thinks FS are not frozen. But that's a different cup of tea. > >>>> > >>>> Because of what written above, we need to call set-level > >>>> on every suspend. > >>> > >>> > >>> IMHO all this says that the 'set-level' command is a conceptually > >>> unfixably broken design& should be killed in QEMU before it turns > >>> into an even bigger mess. > > > > Can you elaborate on this? Michal and I talked on irc about making the > > compatibility level persistent, would that help? > > > >>> Once we're in a situation where we need to call 'set-level' prior > >>> to every single invocation, you might as well just allow the QEMU > >>> version number to be passed in directly as an arg to the command > >>> you are running directly thus avoiding this horrificness. > >> > >> Qemu folks, would you care to chime in on this? > >> > >> Exactly how is the set-level command supposed to work? As I understand > >> it, the goal is that if the guest has qemu-ga 1.1 installed, but is > >> being run by qemu 1.0, then we want to ensure that any guest agent > >> command supported by qemu-ga 1.1 but requiring features of qemu not > >> present in qemu 1.0 will be properly rejected. > > > > Not exactly, the default support of qemu-ga is qemu 1.0. This means that by > > default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This > > way the set-support-level command allows you to specify that qemu 2.0 features > > are supported. > > Version numbers are meaningless. What happens when a bunch of features get > backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of > 2.0? > > The feature negotiation mechanism we have in QMP is the existence of a command. > If we're in a position where we're trying to disable part of a command, it > simply means that we should have multiple commands such that we can just remove > the disabled part entirely. You may have a point that we shouldn't be using the version number for that, but just switching to multiple commands doesn't solve the fundamental problem. The fundamental problem is that, S3 in current (and old) qemu has two known bugs: 1. The screen is left black after S3 (it's a bug in seabios) 2. QEMU resumes the guest immediately (Gerd posted patches to address this) We're going to address both issues in 1.1. However, if qemu-ga is installed in an old qemu and S3 is used, the bugs will be triggered. We need a way for qemu-ga to query qemu about the existence of a working S3 support. The set-support-level solves that. Another option would be to disable (or enable) S3 by default in qemu-ga, and let the admin enable (or disable it) according to S3 support being fixed in qemu. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-30 12:57 ` Luiz Capitulino @ 2012-01-30 13:54 ` Anthony Liguori 2012-01-30 14:44 ` Luiz Capitulino 2012-01-30 15:03 ` Michael Roth 1 sibling, 1 reply; 18+ messages in thread From: Anthony Liguori @ 2012-01-30 13:54 UTC (permalink / raw) To: Luiz Capitulino Cc: libvir-list, Michal Privoznik, Eric Blake, QEMU Developers, mdroth On 01/30/2012 06:57 AM, Luiz Capitulino wrote: > On Thu, 26 Jan 2012 16:57:01 -0600 > Anthony Liguori<anthony@codemonkey.ws> wrote: > >> On 01/26/2012 01:35 PM, Luiz Capitulino wrote: >>> On Thu, 26 Jan 2012 08:18:03 -0700 >>> Eric Blake<eblake@redhat.com> wrote: >>> >>>> [adding qemu-devel] >>>> >>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >>>>>> One thing, that you'll probably notice is this >>>>>> 'set-support-level' command. Basically, it tells GA what qemu version >>>>>> is it running on. Ideally, this should be done as soon as >>>>>> GA starts up. However, that cannot be determined from outside >>>>>> world as GA doesn't emit any events yet. >>>>>> Ideally^2 this command should be left out as it should be qemu >>>>>> who tells its own agent this kind of information. >>>>>> Anyway, I was going to call this command in qemuProcess{Startup, >>>>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >>>>>> so guest can boot and start GA, but that implies returning from qemuProcess*. >>>>>> >>>>>> So I am setting this just before 'guest-suspend' command, as >>>>>> there is one more thing about GA. It is unable to remember anything >>>>>> upon its restart (GA process). Which has BTW show flaw >>>>>> in our current code with FS freeze& thaw. If we freeze guest >>>>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as >>>>>> GA thinks FS are not frozen. But that's a different cup of tea. >>>>>> >>>>>> Because of what written above, we need to call set-level >>>>>> on every suspend. >>>>> >>>>> >>>>> IMHO all this says that the 'set-level' command is a conceptually >>>>> unfixably broken design& should be killed in QEMU before it turns >>>>> into an even bigger mess. >>> >>> Can you elaborate on this? Michal and I talked on irc about making the >>> compatibility level persistent, would that help? >>> >>>>> Once we're in a situation where we need to call 'set-level' prior >>>>> to every single invocation, you might as well just allow the QEMU >>>>> version number to be passed in directly as an arg to the command >>>>> you are running directly thus avoiding this horrificness. >>>> >>>> Qemu folks, would you care to chime in on this? >>>> >>>> Exactly how is the set-level command supposed to work? As I understand >>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is >>>> being run by qemu 1.0, then we want to ensure that any guest agent >>>> command supported by qemu-ga 1.1 but requiring features of qemu not >>>> present in qemu 1.0 will be properly rejected. >>> >>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that by >>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This >>> way the set-support-level command allows you to specify that qemu 2.0 features >>> are supported. >> >> Version numbers are meaningless. What happens when a bunch of features get >> backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of >> 2.0? >> >> The feature negotiation mechanism we have in QMP is the existence of a command. >> If we're in a position where we're trying to disable part of a command, it >> simply means that we should have multiple commands such that we can just remove >> the disabled part entirely. > > You may have a point that we shouldn't be using the version number for that, > but just switching to multiple commands doesn't solve the fundamental problem. > > The fundamental problem is that, S3 in current (and old) qemu has two known bugs: > > 1. The screen is left black after S3 (it's a bug in seabios) > 2. QEMU resumes the guest immediately (Gerd posted patches to address this) > > We're going to address both issues in 1.1. However, if qemu-ga is installed in > an old qemu and S3 is used, the bugs will be triggered. It's a management tool problem. Before a management tool issues a command, it should query the existence of the command to determine whether this version of QEMU has that capability. If the tool needs to use two commands, it should query the existence of both of them. In this case, the management tool needs a qemu-ga command *and* a QEMU command (to resume from suspend) so it should query both of them. Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3 worked in QEMU as expected. Alternatively, if there really was no reason to have a resume-from-suspend command, this would be the point where we would add a capabilities command adding the "working-s3" capability. But with capabilities, this is a direct QEMU->management tool interaction, not a proxy through the guest agent. We shouldn't trust the guest agent and we certainly don't want to rely on the guest agent to avoid sending an improper command to QEMU! That would be a security issue. > > We need a way for qemu-ga to query qemu about the existence of a working S3 > support. The set-support-level solves that. qemu-ga is not an entry point for QEMU features. It's strictly a mechanism to ask the guest to do something. If we need to interact with QEMU directly to query a capability and/or presence of a command, then we should talk to QEMU directly. To put it another way, a management tool MUST deal with the fact that when issuing the suspend-to-ram command, a guest may ignore it or attempt to do something malicious. Regards, Anthony Liguori > Another option would be to disable (or enable) S3 by default in qemu-ga, and let > the admin enable (or disable it) according to S3 support being fixed in qemu. > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-30 13:54 ` Anthony Liguori @ 2012-01-30 14:44 ` Luiz Capitulino 2012-01-30 15:43 ` Michael Roth ` (2 more replies) 0 siblings, 3 replies; 18+ messages in thread From: Luiz Capitulino @ 2012-01-30 14:44 UTC (permalink / raw) To: Anthony Liguori Cc: libvir-list, Michal Privoznik, Eric Blake, QEMU Developers, mdroth On Mon, 30 Jan 2012 07:54:56 -0600 Anthony Liguori <anthony@codemonkey.ws> wrote: > On 01/30/2012 06:57 AM, Luiz Capitulino wrote: > > On Thu, 26 Jan 2012 16:57:01 -0600 > > Anthony Liguori<anthony@codemonkey.ws> wrote: > > > >> On 01/26/2012 01:35 PM, Luiz Capitulino wrote: > >>> On Thu, 26 Jan 2012 08:18:03 -0700 > >>> Eric Blake<eblake@redhat.com> wrote: > >>> > >>>> [adding qemu-devel] > >>>> > >>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: > >>>>>> One thing, that you'll probably notice is this > >>>>>> 'set-support-level' command. Basically, it tells GA what qemu version > >>>>>> is it running on. Ideally, this should be done as soon as > >>>>>> GA starts up. However, that cannot be determined from outside > >>>>>> world as GA doesn't emit any events yet. > >>>>>> Ideally^2 this command should be left out as it should be qemu > >>>>>> who tells its own agent this kind of information. > >>>>>> Anyway, I was going to call this command in qemuProcess{Startup, > >>>>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs > >>>>>> so guest can boot and start GA, but that implies returning from qemuProcess*. > >>>>>> > >>>>>> So I am setting this just before 'guest-suspend' command, as > >>>>>> there is one more thing about GA. It is unable to remember anything > >>>>>> upon its restart (GA process). Which has BTW show flaw > >>>>>> in our current code with FS freeze& thaw. If we freeze guest > >>>>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as > >>>>>> GA thinks FS are not frozen. But that's a different cup of tea. > >>>>>> > >>>>>> Because of what written above, we need to call set-level > >>>>>> on every suspend. > >>>>> > >>>>> > >>>>> IMHO all this says that the 'set-level' command is a conceptually > >>>>> unfixably broken design& should be killed in QEMU before it turns > >>>>> into an even bigger mess. > >>> > >>> Can you elaborate on this? Michal and I talked on irc about making the > >>> compatibility level persistent, would that help? > >>> > >>>>> Once we're in a situation where we need to call 'set-level' prior > >>>>> to every single invocation, you might as well just allow the QEMU > >>>>> version number to be passed in directly as an arg to the command > >>>>> you are running directly thus avoiding this horrificness. > >>>> > >>>> Qemu folks, would you care to chime in on this? > >>>> > >>>> Exactly how is the set-level command supposed to work? As I understand > >>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is > >>>> being run by qemu 1.0, then we want to ensure that any guest agent > >>>> command supported by qemu-ga 1.1 but requiring features of qemu not > >>>> present in qemu 1.0 will be properly rejected. > >>> > >>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that by > >>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This > >>> way the set-support-level command allows you to specify that qemu 2.0 features > >>> are supported. > >> > >> Version numbers are meaningless. What happens when a bunch of features get > >> backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of > >> 2.0? > >> > >> The feature negotiation mechanism we have in QMP is the existence of a command. > >> If we're in a position where we're trying to disable part of a command, it > >> simply means that we should have multiple commands such that we can just remove > >> the disabled part entirely. > > > > You may have a point that we shouldn't be using the version number for that, > > but just switching to multiple commands doesn't solve the fundamental problem. > > > > The fundamental problem is that, S3 in current (and old) qemu has two known bugs: > > > > 1. The screen is left black after S3 (it's a bug in seabios) > > 2. QEMU resumes the guest immediately (Gerd posted patches to address this) > > > > We're going to address both issues in 1.1. However, if qemu-ga is installed in > > an old qemu and S3 is used, the bugs will be triggered. > > It's a management tool problem. > > Before a management tool issues a command, it should query the existence of the > command to determine whether this version of QEMU has that capability. If the > tool needs to use two commands, it should query the existence of both of them. > > In this case, the management tool needs a qemu-ga command *and* a QEMU command > (to resume from suspend) so it should query both of them. > > Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3 > worked in QEMU as expected. That's right, it's a coincidence, but I don't see why this wouldn't work. I think we should do the following then: 1. Drop the set-support-level command 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, guest-suspend-disk 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing the guest-suspend-ram command Michal, Michael, do you agree? > Alternatively, if there really was no reason to have a resume-from-suspend > command, this would be the point where we would add a capabilities command > adding the "working-s3" capability. > > But with capabilities, this is a direct QEMU->management tool interaction, not a > proxy through the guest agent. > > We shouldn't trust the guest agent and we certainly don't want to rely on the > guest agent to avoid sending an improper command to QEMU! That would be a > security issue. Fair enough, although that makes me wonder if the planned feature of pulling qemu-ga's commands into the QMP namespace won't have similar security issues. > > > > > We need a way for qemu-ga to query qemu about the existence of a working S3 > > support. The set-support-level solves that. > > qemu-ga is not an entry point for QEMU features. It's strictly a mechanism to > ask the guest to do something. If we need to interact with QEMU directly to > query a capability and/or presence of a command, then we should talk to QEMU > directly. > > To put it another way, a management tool MUST deal with the fact that when > issuing the suspend-to-ram command, a guest may ignore it or attempt to do > something malicious. > > Regards, > > Anthony Liguori > > > > Another option would be to disable (or enable) S3 by default in qemu-ga, and let > > the admin enable (or disable it) according to S3 support being fixed in qemu. > > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-30 14:44 ` Luiz Capitulino @ 2012-01-30 15:43 ` Michael Roth 2012-01-30 15:58 ` Eric Blake 2012-01-30 16:08 ` Michal Privoznik 2 siblings, 0 replies; 18+ messages in thread From: Michael Roth @ 2012-01-30 15:43 UTC (permalink / raw) To: Luiz Capitulino Cc: libvir-list, Michal Privoznik, Eric Blake, QEMU Developers On 01/30/2012 08:44 AM, Luiz Capitulino wrote: > On Mon, 30 Jan 2012 07:54:56 -0600 > Anthony Liguori<anthony@codemonkey.ws> wrote: > >> On 01/30/2012 06:57 AM, Luiz Capitulino wrote: >>> On Thu, 26 Jan 2012 16:57:01 -0600 >>> Anthony Liguori<anthony@codemonkey.ws> wrote: >>> >>>> On 01/26/2012 01:35 PM, Luiz Capitulino wrote: >>>>> On Thu, 26 Jan 2012 08:18:03 -0700 >>>>> Eric Blake<eblake@redhat.com> wrote: >>>>> >>>>>> [adding qemu-devel] >>>>>> >>>>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >>>>>>>> One thing, that you'll probably notice is this >>>>>>>> 'set-support-level' command. Basically, it tells GA what qemu version >>>>>>>> is it running on. Ideally, this should be done as soon as >>>>>>>> GA starts up. However, that cannot be determined from outside >>>>>>>> world as GA doesn't emit any events yet. >>>>>>>> Ideally^2 this command should be left out as it should be qemu >>>>>>>> who tells its own agent this kind of information. >>>>>>>> Anyway, I was going to call this command in qemuProcess{Startup, >>>>>>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >>>>>>>> so guest can boot and start GA, but that implies returning from qemuProcess*. >>>>>>>> >>>>>>>> So I am setting this just before 'guest-suspend' command, as >>>>>>>> there is one more thing about GA. It is unable to remember anything >>>>>>>> upon its restart (GA process). Which has BTW show flaw >>>>>>>> in our current code with FS freeze& thaw. If we freeze guest >>>>>>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as >>>>>>>> GA thinks FS are not frozen. But that's a different cup of tea. >>>>>>>> >>>>>>>> Because of what written above, we need to call set-level >>>>>>>> on every suspend. >>>>>>> >>>>>>> >>>>>>> IMHO all this says that the 'set-level' command is a conceptually >>>>>>> unfixably broken design& should be killed in QEMU before it turns >>>>>>> into an even bigger mess. >>>>> >>>>> Can you elaborate on this? Michal and I talked on irc about making the >>>>> compatibility level persistent, would that help? >>>>> >>>>>>> Once we're in a situation where we need to call 'set-level' prior >>>>>>> to every single invocation, you might as well just allow the QEMU >>>>>>> version number to be passed in directly as an arg to the command >>>>>>> you are running directly thus avoiding this horrificness. >>>>>> >>>>>> Qemu folks, would you care to chime in on this? >>>>>> >>>>>> Exactly how is the set-level command supposed to work? As I understand >>>>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is >>>>>> being run by qemu 1.0, then we want to ensure that any guest agent >>>>>> command supported by qemu-ga 1.1 but requiring features of qemu not >>>>>> present in qemu 1.0 will be properly rejected. >>>>> >>>>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that by >>>>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This >>>>> way the set-support-level command allows you to specify that qemu 2.0 features >>>>> are supported. >>>> >>>> Version numbers are meaningless. What happens when a bunch of features get >>>> backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of >>>> 2.0? >>>> >>>> The feature negotiation mechanism we have in QMP is the existence of a command. >>>> If we're in a position where we're trying to disable part of a command, it >>>> simply means that we should have multiple commands such that we can just remove >>>> the disabled part entirely. >>> >>> You may have a point that we shouldn't be using the version number for that, >>> but just switching to multiple commands doesn't solve the fundamental problem. >>> >>> The fundamental problem is that, S3 in current (and old) qemu has two known bugs: >>> >>> 1. The screen is left black after S3 (it's a bug in seabios) >>> 2. QEMU resumes the guest immediately (Gerd posted patches to address this) >>> >>> We're going to address both issues in 1.1. However, if qemu-ga is installed in >>> an old qemu and S3 is used, the bugs will be triggered. >> >> It's a management tool problem. >> >> Before a management tool issues a command, it should query the existence of the >> command to determine whether this version of QEMU has that capability. If the >> tool needs to use two commands, it should query the existence of both of them. >> >> In this case, the management tool needs a qemu-ga command *and* a QEMU command >> (to resume from suspend) so it should query both of them. >> >> Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3 >> worked in QEMU as expected. > > That's right, it's a coincidence, but I don't see why this wouldn't work. > > I think we should do the following then: > > 1. Drop the set-support-level command > 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, > guest-suspend-disk > 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing > the guest-suspend-ram command > > Michal, Michael, do you agree? Yup yup, looks sound to me. > >> Alternatively, if there really was no reason to have a resume-from-suspend >> command, this would be the point where we would add a capabilities command >> adding the "working-s3" capability. >> >> But with capabilities, this is a direct QEMU->management tool interaction, not a >> proxy through the guest agent. >> >> We shouldn't trust the guest agent and we certainly don't want to rely on the >> guest agent to avoid sending an improper command to QEMU! That would be a >> security issue. > > Fair enough, although that makes me wonder if the planned feature of pulling > qemu-ga's commands into the QMP namespace won't have similar security issues. > >> >>> >>> We need a way for qemu-ga to query qemu about the existence of a working S3 >>> support. The set-support-level solves that. >> >> qemu-ga is not an entry point for QEMU features. It's strictly a mechanism to >> ask the guest to do something. If we need to interact with QEMU directly to >> query a capability and/or presence of a command, then we should talk to QEMU >> directly. >> >> To put it another way, a management tool MUST deal with the fact that when >> issuing the suspend-to-ram command, a guest may ignore it or attempt to do >> something malicious. >> >> Regards, >> >> Anthony Liguori >> >> >>> Another option would be to disable (or enable) S3 by default in qemu-ga, and let >>> the admin enable (or disable it) according to S3 support being fixed in qemu. >>> >> > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-30 14:44 ` Luiz Capitulino 2012-01-30 15:43 ` Michael Roth @ 2012-01-30 15:58 ` Eric Blake 2012-01-30 17:07 ` Michael Roth 2012-01-30 16:08 ` Michal Privoznik 2 siblings, 1 reply; 18+ messages in thread From: Eric Blake @ 2012-01-30 15:58 UTC (permalink / raw) To: Luiz Capitulino; +Cc: libvir-list, Michal Privoznik, QEMU Developers, mdroth [-- Attachment #1: Type: text/plain, Size: 1618 bytes --] On 01/30/2012 07:44 AM, Luiz Capitulino wrote: > I think we should do the following then: > > 1. Drop the set-support-level command > 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, > guest-suspend-disk > 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing > the guest-suspend-ram command > > Michal, Michael, do you agree? I'm not Michal, but speaking for libvirt, this definitely sounds like the way to go. Questions: Should libvirt also check for 'wakeup' before attempting guest-suspend-hybrid? With guest-suspend-disk, what happens when the suspend completes? Does this look like a normal case of the guest powering down, which qemu then passes as an event to libvirt and libvirt then ends the qemu process? That would mean that the only difference from a normal guest shutdown is that on the next guest boot the guest's disk image allows to recover state from disk rather than booting from scratch; either way, a new qemu process is created to resume the guest, and qemu is doing nothing different in how it starts the guest (it's just that the guest itself does something different based on what it stored into the disk images before shutting down). Also, I know there has been talk about a qemu-ga command to resync clocks after a resume from S3 and/or 'loadvm'; is this command fully in place yet, and is it another command that libvirt should be checking for prior to allowing any S3 attempts? -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 620 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-30 15:58 ` Eric Blake @ 2012-01-30 17:07 ` Michael Roth 2012-01-30 18:30 ` Luiz Capitulino 0 siblings, 1 reply; 18+ messages in thread From: Michael Roth @ 2012-01-30 17:07 UTC (permalink / raw) To: Eric Blake Cc: libvir-list, Michal Privoznik, QEMU Developers, Luiz Capitulino On 01/30/2012 09:58 AM, Eric Blake wrote: > On 01/30/2012 07:44 AM, Luiz Capitulino wrote: >> I think we should do the following then: >> >> 1. Drop the set-support-level command >> 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, >> guest-suspend-disk >> 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing >> the guest-suspend-ram command >> >> Michal, Michael, do you agree? > > I'm not Michal, but speaking for libvirt, this definitely sounds like > the way to go. > > Questions: > > Should libvirt also check for 'wakeup' before attempting > guest-suspend-hybrid? > > With guest-suspend-disk, what happens when the suspend completes? Does > this look like a normal case of the guest powering down, which qemu then > passes as an event to libvirt and libvirt then ends the qemu process? > That would mean that the only difference from a normal guest shutdown is > that on the next guest boot the guest's disk image allows to recover > state from disk rather than booting from scratch; either way, a new qemu > process is created to resume the guest, and qemu is doing nothing > different in how it starts the guest (it's just that the guest itself > does something different based on what it stored into the disk images > before shutting down). > > Also, I know there has been talk about a qemu-ga command to resync > clocks after a resume from S3 and/or 'loadvm'; is this command fully in > place yet, and is it another command that libvirt should be checking for > prior to allowing any S3 attempts? > Hi Eric, I'm not aware of a clock re-sync command being worked on.. are we maybe talking about the guest-sync command qemu-ga currently has in place to re-sync the protocol stream after a client-side timeout? ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-30 17:07 ` Michael Roth @ 2012-01-30 18:30 ` Luiz Capitulino 0 siblings, 0 replies; 18+ messages in thread From: Luiz Capitulino @ 2012-01-30 18:30 UTC (permalink / raw) To: Michael Roth; +Cc: libvir-list, Michal Privoznik, Eric Blake, QEMU Developers On Mon, 30 Jan 2012 11:07:10 -0600 Michael Roth <mdroth@linux.vnet.ibm.com> wrote: > On 01/30/2012 09:58 AM, Eric Blake wrote: > > On 01/30/2012 07:44 AM, Luiz Capitulino wrote: > >> I think we should do the following then: > >> > >> 1. Drop the set-support-level command > >> 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, > >> guest-suspend-disk > >> 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing > >> the guest-suspend-ram command > >> > >> Michal, Michael, do you agree? > > > > I'm not Michal, but speaking for libvirt, this definitely sounds like > > the way to go. > > > > Questions: > > > > Should libvirt also check for 'wakeup' before attempting > > guest-suspend-hybrid? Yes. > > With guest-suspend-disk, what happens when the suspend completes? Does > > this look like a normal case of the guest powering down, which qemu then > > passes as an event to libvirt and libvirt then ends the qemu process? Yes. > > That would mean that the only difference from a normal guest shutdown is > > that on the next guest boot the guest's disk image allows to recover > > state from disk rather than booting from scratch; either way, a new qemu > > process is created to resume the guest, and qemu is doing nothing > > different in how it starts the guest (it's just that the guest itself > > does something different based on what it stored into the disk images > > before shutting down). Exactly. > > Also, I know there has been talk about a qemu-ga command to resync > > clocks after a resume from S3 and/or 'loadvm'; is this command fully in > > place yet, and is it another command that libvirt should be checking for > > prior to allowing any S3 attempts? > > > > Hi Eric, > > I'm not aware of a clock re-sync command being worked on.. are we maybe > talking about the guest-sync command qemu-ga currently has in place to > re-sync the protocol stream after a client-side timeout? I've heard some conversations about doing what Eric is saying, but I don't have any details either. Eric, do you know whom is assigned to work on that on the qemu side? ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-30 14:44 ` Luiz Capitulino 2012-01-30 15:43 ` Michael Roth 2012-01-30 15:58 ` Eric Blake @ 2012-01-30 16:08 ` Michal Privoznik 2012-01-30 18:36 ` Luiz Capitulino 2 siblings, 1 reply; 18+ messages in thread From: Michal Privoznik @ 2012-01-30 16:08 UTC (permalink / raw) To: Luiz Capitulino; +Cc: libvir-list, Eric Blake, QEMU Developers, mdroth On 30.01.2012 15:44, Luiz Capitulino wrote: > On Mon, 30 Jan 2012 07:54:56 -0600 > Anthony Liguori <anthony@codemonkey.ws> wrote: > >> On 01/30/2012 06:57 AM, Luiz Capitulino wrote: >>> On Thu, 26 Jan 2012 16:57:01 -0600 >>> Anthony Liguori<anthony@codemonkey.ws> wrote: >>> >>>> On 01/26/2012 01:35 PM, Luiz Capitulino wrote: >>>>> On Thu, 26 Jan 2012 08:18:03 -0700 >>>>> Eric Blake<eblake@redhat.com> wrote: >>>>> >>>>>> [adding qemu-devel] >>>>>> >>>>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >>>>>>>> One thing, that you'll probably notice is this >>>>>>>> 'set-support-level' command. Basically, it tells GA what qemu version >>>>>>>> is it running on. Ideally, this should be done as soon as >>>>>>>> GA starts up. However, that cannot be determined from outside >>>>>>>> world as GA doesn't emit any events yet. >>>>>>>> Ideally^2 this command should be left out as it should be qemu >>>>>>>> who tells its own agent this kind of information. >>>>>>>> Anyway, I was going to call this command in qemuProcess{Startup, >>>>>>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >>>>>>>> so guest can boot and start GA, but that implies returning from qemuProcess*. >>>>>>>> >>>>>>>> So I am setting this just before 'guest-suspend' command, as >>>>>>>> there is one more thing about GA. It is unable to remember anything >>>>>>>> upon its restart (GA process). Which has BTW show flaw >>>>>>>> in our current code with FS freeze& thaw. If we freeze guest >>>>>>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as >>>>>>>> GA thinks FS are not frozen. But that's a different cup of tea. >>>>>>>> >>>>>>>> Because of what written above, we need to call set-level >>>>>>>> on every suspend. >>>>>>> >>>>>>> >>>>>>> IMHO all this says that the 'set-level' command is a conceptually >>>>>>> unfixably broken design& should be killed in QEMU before it turns >>>>>>> into an even bigger mess. >>>>> >>>>> Can you elaborate on this? Michal and I talked on irc about making the >>>>> compatibility level persistent, would that help? >>>>> >>>>>>> Once we're in a situation where we need to call 'set-level' prior >>>>>>> to every single invocation, you might as well just allow the QEMU >>>>>>> version number to be passed in directly as an arg to the command >>>>>>> you are running directly thus avoiding this horrificness. >>>>>> >>>>>> Qemu folks, would you care to chime in on this? >>>>>> >>>>>> Exactly how is the set-level command supposed to work? As I understand >>>>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is >>>>>> being run by qemu 1.0, then we want to ensure that any guest agent >>>>>> command supported by qemu-ga 1.1 but requiring features of qemu not >>>>>> present in qemu 1.0 will be properly rejected. >>>>> >>>>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that by >>>>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This >>>>> way the set-support-level command allows you to specify that qemu 2.0 features >>>>> are supported. >>>> >>>> Version numbers are meaningless. What happens when a bunch of features get >>>> backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of >>>> 2.0? >>>> >>>> The feature negotiation mechanism we have in QMP is the existence of a command. >>>> If we're in a position where we're trying to disable part of a command, it >>>> simply means that we should have multiple commands such that we can just remove >>>> the disabled part entirely. >>> >>> You may have a point that we shouldn't be using the version number for that, >>> but just switching to multiple commands doesn't solve the fundamental problem. >>> >>> The fundamental problem is that, S3 in current (and old) qemu has two known bugs: >>> >>> 1. The screen is left black after S3 (it's a bug in seabios) >>> 2. QEMU resumes the guest immediately (Gerd posted patches to address this) >>> >>> We're going to address both issues in 1.1. However, if qemu-ga is installed in >>> an old qemu and S3 is used, the bugs will be triggered. >> >> It's a management tool problem. >> >> Before a management tool issues a command, it should query the existence of the >> command to determine whether this version of QEMU has that capability. If the >> tool needs to use two commands, it should query the existence of both of them. >> >> In this case, the management tool needs a qemu-ga command *and* a QEMU command >> (to resume from suspend) so it should query both of them. >> >> Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3 >> worked in QEMU as expected. > > That's right, it's a coincidence, but I don't see why this wouldn't work. > > I think we should do the following then: > > 1. Drop the set-support-level command > 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, > guest-suspend-disk > 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing > the guest-suspend-ram command > > Michal, Michael, do you agree? I don't see into qemu internals, but <like> to dropping set-support-level for sure. From my POV, splitting may solve the problem. But how hard would it be to fix suspend to ram in qemu, so libvirt doesn't have to query for wakeup command? It would be nice if we have easy to use interface. Michal ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-30 16:08 ` Michal Privoznik @ 2012-01-30 18:36 ` Luiz Capitulino 0 siblings, 0 replies; 18+ messages in thread From: Luiz Capitulino @ 2012-01-30 18:36 UTC (permalink / raw) To: Michal Privoznik; +Cc: libvir-list, Eric Blake, QEMU Developers, mdroth On Mon, 30 Jan 2012 17:08:33 +0100 Michal Privoznik <mprivozn@redhat.com> wrote: > On 30.01.2012 15:44, Luiz Capitulino wrote: > > On Mon, 30 Jan 2012 07:54:56 -0600 > > Anthony Liguori <anthony@codemonkey.ws> wrote: > > > >> On 01/30/2012 06:57 AM, Luiz Capitulino wrote: > >>> On Thu, 26 Jan 2012 16:57:01 -0600 > >>> Anthony Liguori<anthony@codemonkey.ws> wrote: > >>> > >>>> On 01/26/2012 01:35 PM, Luiz Capitulino wrote: > >>>>> On Thu, 26 Jan 2012 08:18:03 -0700 > >>>>> Eric Blake<eblake@redhat.com> wrote: > >>>>> > >>>>>> [adding qemu-devel] > >>>>>> > >>>>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: > >>>>>>>> One thing, that you'll probably notice is this > >>>>>>>> 'set-support-level' command. Basically, it tells GA what qemu version > >>>>>>>> is it running on. Ideally, this should be done as soon as > >>>>>>>> GA starts up. However, that cannot be determined from outside > >>>>>>>> world as GA doesn't emit any events yet. > >>>>>>>> Ideally^2 this command should be left out as it should be qemu > >>>>>>>> who tells its own agent this kind of information. > >>>>>>>> Anyway, I was going to call this command in qemuProcess{Startup, > >>>>>>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs > >>>>>>>> so guest can boot and start GA, but that implies returning from qemuProcess*. > >>>>>>>> > >>>>>>>> So I am setting this just before 'guest-suspend' command, as > >>>>>>>> there is one more thing about GA. It is unable to remember anything > >>>>>>>> upon its restart (GA process). Which has BTW show flaw > >>>>>>>> in our current code with FS freeze& thaw. If we freeze guest > >>>>>>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as > >>>>>>>> GA thinks FS are not frozen. But that's a different cup of tea. > >>>>>>>> > >>>>>>>> Because of what written above, we need to call set-level > >>>>>>>> on every suspend. > >>>>>>> > >>>>>>> > >>>>>>> IMHO all this says that the 'set-level' command is a conceptually > >>>>>>> unfixably broken design& should be killed in QEMU before it turns > >>>>>>> into an even bigger mess. > >>>>> > >>>>> Can you elaborate on this? Michal and I talked on irc about making the > >>>>> compatibility level persistent, would that help? > >>>>> > >>>>>>> Once we're in a situation where we need to call 'set-level' prior > >>>>>>> to every single invocation, you might as well just allow the QEMU > >>>>>>> version number to be passed in directly as an arg to the command > >>>>>>> you are running directly thus avoiding this horrificness. > >>>>>> > >>>>>> Qemu folks, would you care to chime in on this? > >>>>>> > >>>>>> Exactly how is the set-level command supposed to work? As I understand > >>>>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is > >>>>>> being run by qemu 1.0, then we want to ensure that any guest agent > >>>>>> command supported by qemu-ga 1.1 but requiring features of qemu not > >>>>>> present in qemu 1.0 will be properly rejected. > >>>>> > >>>>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that by > >>>>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This > >>>>> way the set-support-level command allows you to specify that qemu 2.0 features > >>>>> are supported. > >>>> > >>>> Version numbers are meaningless. What happens when a bunch of features get > >>>> backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of > >>>> 2.0? > >>>> > >>>> The feature negotiation mechanism we have in QMP is the existence of a command. > >>>> If we're in a position where we're trying to disable part of a command, it > >>>> simply means that we should have multiple commands such that we can just remove > >>>> the disabled part entirely. > >>> > >>> You may have a point that we shouldn't be using the version number for that, > >>> but just switching to multiple commands doesn't solve the fundamental problem. > >>> > >>> The fundamental problem is that, S3 in current (and old) qemu has two known bugs: > >>> > >>> 1. The screen is left black after S3 (it's a bug in seabios) > >>> 2. QEMU resumes the guest immediately (Gerd posted patches to address this) > >>> > >>> We're going to address both issues in 1.1. However, if qemu-ga is installed in > >>> an old qemu and S3 is used, the bugs will be triggered. > >> > >> It's a management tool problem. > >> > >> Before a management tool issues a command, it should query the existence of the > >> command to determine whether this version of QEMU has that capability. If the > >> tool needs to use two commands, it should query the existence of both of them. > >> > >> In this case, the management tool needs a qemu-ga command *and* a QEMU command > >> (to resume from suspend) so it should query both of them. > >> > >> Obviously, we wouldn't have a resume-from-suspend command in QEMU unless it S3 > >> worked in QEMU as expected. > > > > That's right, it's a coincidence, but I don't see why this wouldn't work. > > > > I think we should do the following then: > > > > 1. Drop the set-support-level command > > 2. Split the guest-suspend command into guest-suspend-ram, guest-suspend-hybrid, > > guest-suspend-disk > > 3. Libvirt should query for _QEMU_'s 'wakeup' command before issuing > > the guest-suspend-ram command > > > > Michal, Michael, do you agree? > > I don't see into qemu internals, but <like> to dropping > set-support-level for sure. From my POV, splitting may solve the > problem. But how hard would it be to fix suspend to ram in qemu, so > libvirt doesn't have to query for wakeup command? It would be nice if we > have easy to use interface. libivrt will have to always query for the wakeup command, because that's the only way to know if the qemu libvirt is talking to supports suspend to ram. Of course that you should always query for a command before issuing it too (you can use the guest-info command for that), because qemu-ga commands may be unavailable. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-30 12:57 ` Luiz Capitulino 2012-01-30 13:54 ` Anthony Liguori @ 2012-01-30 15:03 ` Michael Roth 1 sibling, 0 replies; 18+ messages in thread From: Michael Roth @ 2012-01-30 15:03 UTC (permalink / raw) To: Luiz Capitulino Cc: libvir-list, Michal Privoznik, Eric Blake, QEMU Developers On 01/30/2012 06:57 AM, Luiz Capitulino wrote: > On Thu, 26 Jan 2012 16:57:01 -0600 > Anthony Liguori<anthony@codemonkey.ws> wrote: > >> On 01/26/2012 01:35 PM, Luiz Capitulino wrote: >>> On Thu, 26 Jan 2012 08:18:03 -0700 >>> Eric Blake<eblake@redhat.com> wrote: >>> >>>> [adding qemu-devel] >>>> >>>> On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >>>>>> One thing, that you'll probably notice is this >>>>>> 'set-support-level' command. Basically, it tells GA what qemu version >>>>>> is it running on. Ideally, this should be done as soon as >>>>>> GA starts up. However, that cannot be determined from outside >>>>>> world as GA doesn't emit any events yet. >>>>>> Ideally^2 this command should be left out as it should be qemu >>>>>> who tells its own agent this kind of information. >>>>>> Anyway, I was going to call this command in qemuProcess{Startup, >>>>>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >>>>>> so guest can boot and start GA, but that implies returning from qemuProcess*. >>>>>> >>>>>> So I am setting this just before 'guest-suspend' command, as >>>>>> there is one more thing about GA. It is unable to remember anything >>>>>> upon its restart (GA process). Which has BTW show flaw >>>>>> in our current code with FS freeze& thaw. If we freeze guest >>>>>> FS, and somebody restart GA, the simple FS Thaw will not succeed as >>>>>> GA thinks FS are not frozen. But that's a different cup of tea. >>>>>> >>>>>> Because of what written above, we need to call set-level >>>>>> on every suspend. >>>>> >>>>> >>>>> IMHO all this says that the 'set-level' command is a conceptually >>>>> unfixably broken design& should be killed in QEMU before it turns >>>>> into an even bigger mess. >>> >>> Can you elaborate on this? Michal and I talked on irc about making the >>> compatibility level persistent, would that help? >>> >>>>> Once we're in a situation where we need to call 'set-level' prior >>>>> to every single invocation, you might as well just allow the QEMU >>>>> version number to be passed in directly as an arg to the command >>>>> you are running directly thus avoiding this horrificness. >>>> >>>> Qemu folks, would you care to chime in on this? >>>> >>>> Exactly how is the set-level command supposed to work? As I understand >>>> it, the goal is that if the guest has qemu-ga 1.1 installed, but is >>>> being run by qemu 1.0, then we want to ensure that any guest agent >>>> command supported by qemu-ga 1.1 but requiring features of qemu not >>>> present in qemu 1.0 will be properly rejected. >>> >>> Not exactly, the default support of qemu-ga is qemu 1.0. This means that by >>> default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This >>> way the set-support-level command allows you to specify that qemu 2.0 features >>> are supported. >> >> Version numbers are meaningless. What happens when a bunch of features get >> backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of >> 2.0? >> >> The feature negotiation mechanism we have in QMP is the existence of a command. >> If we're in a position where we're trying to disable part of a command, it >> simply means that we should have multiple commands such that we can just remove >> the disabled part entirely. > > You may have a point that we shouldn't be using the version number for that, > but just switching to multiple commands doesn't solve the fundamental problem. Agreed, but the multiple commands isn't really the fix here, it's libvirt querying for the "wakeup" command that Gerd's patches add. We implemented set-version-level as a way to let management tools obliviously report something simple, like the version information it parses from QEMU already, to let the guest agent Do The Right Thing without any deep insight into what it's host requirements are (which is also why we opted for versions over specific capabilities flags). But we can't rely on version levels applying in all cases, a distro might backport s3 support for 1.0, modify the guest agent version level dependencies accordingly, and thus render the that agent incompatible with, say, RHEL. So it was a naive approach on my part, and the issues the libvirt folks noted with not knowing if a reset occurred since the last set-support-level make it even less desirable (we could persist it as you suggested, and I am working on patches to persist fsfreeze state, but it's not worth trying to fix set-support-level). And at least in this case we have an easy out: libvirt knows it can't resume a guest without the presence of a QMP command to do so, and that doesn't require any intimate knowledge of how the agent works, it's just common sense. Breaking the suspend/hibernate stuff into multiple commands avoids the need for libvirt to special case based on specific parameters to the command. We can also imagine distro-specific implementations of the agent disabling s4 due to things like not having s4 patches for virtio in place, so it makes it easier (possible) to discover those situations as well. It may be that we opt for a single command for libvirt, but at least on the agent side it should be multiple discoverable ones. Hopefully this doesn't set anyone back too much :( IMHO it's the right way to go though. > > The fundamental problem is that, S3 in current (and old) qemu has two known bugs: > > 1. The screen is left black after S3 (it's a bug in seabios) > 2. QEMU resumes the guest immediately (Gerd posted patches to address this) > > We're going to address both issues in 1.1. However, if qemu-ga is installed in > an old qemu and S3 is used, the bugs will be triggered. > > We need a way for qemu-ga to query qemu about the existence of a working S3 > support. The set-support-level solves that. > > Another option would be to disable (or enable) S3 by default in qemu-ga, and let > the admin enable (or disable it) according to S3 support being fixed in qemu. > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-26 15:18 ` [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests Eric Blake 2012-01-26 19:35 ` Luiz Capitulino @ 2012-01-26 22:54 ` Anthony Liguori 2012-01-27 0:01 ` Michael Roth 2 siblings, 0 replies; 18+ messages in thread From: Anthony Liguori @ 2012-01-26 22:54 UTC (permalink / raw) To: Eric Blake; +Cc: libvir-list, Michal Privoznik, QEMU Developers On 01/26/2012 09:18 AM, Eric Blake wrote: > [adding qemu-devel] > > On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >>> One thing, that you'll probably notice is this >>> 'set-support-level' command. Basically, it tells GA what qemu version >>> is it running on. Ideally, this should be done as soon as >>> GA starts up. However, that cannot be determined from outside >>> world as GA doesn't emit any events yet. >>> Ideally^2 this command should be left out as it should be qemu >>> who tells its own agent this kind of information. >>> Anyway, I was going to call this command in qemuProcess{Startup, >>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >>> so guest can boot and start GA, but that implies returning from qemuProcess*. >>> >>> So I am setting this just before 'guest-suspend' command, as >>> there is one more thing about GA. It is unable to remember anything >>> upon its restart (GA process). Which has BTW show flaw >>> in our current code with FS freeze& thaw. If we freeze guest >>> FS, and somebody restart GA, the simple FS Thaw will not succeed as >>> GA thinks FS are not frozen. But that's a different cup of tea. >>> >>> Because of what written above, we need to call set-level >>> on every suspend. >> >> >> IMHO all this says that the 'set-level' command is a conceptually >> unfixably broken design& should be killed in QEMU before it turns >> into an even bigger mess. >> >> Once we're in a situation where we need to call 'set-level' prior >> to every single invocation, you might as well just allow the QEMU >> version number to be passed in directly as an arg to the command >> you are running directly thus avoiding this horrificness. > > Qemu folks, would you care to chime in on this? Big Ack on my part. I told Mike this afternoon that I wasn't going to take the pull request with this command in it. The fundamental problem here is simple--untested code is broken code. Until we introduce a resume from suspend command (such that we can test the guest agent suspend command), we shouldn't be implementing a guest-agent suspend command. As far as I can tell, the only reason we're introducing it is because we're trying to add a multiplexed command that does suspend to ram and suspend to disk. Since it's multiplexed, it's an all-or-nothing introduction. We're then adding a side-interface to attempt to deal work around that. Let's not introduce a multiplexed command in the first place. Here's what I suggest: 1) Throw away set-level interface. 2) Introduce a suspend-to-disk command. 3) Plan to introduce a suspend-to-ram command, but I won't pull it until we have the ability to test it successfully (which means we probably need a resume-from-ram command for QMP). 4) libvirt can probe the existence of suspend-to-disk in the guest agent and act accordingly. 5) To implement virDomainSuspendToRam (or whatever it will be called), libvirt should: a) check if the guest agent command 'suspend-to-ram' exists b) check if the QMP command 'resume-from-ram' exists 6) The recommendation of (5) should be prominently documented in qapi-schema.json 7) In order for libvirt to start implementing (5), we should stub out (3) in qapi-schema-guest.json but set gen=False. That commits us to the interface without actually introducing the command. Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests 2012-01-26 15:18 ` [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests Eric Blake 2012-01-26 19:35 ` Luiz Capitulino 2012-01-26 22:54 ` Anthony Liguori @ 2012-01-27 0:01 ` Michael Roth 2 siblings, 0 replies; 18+ messages in thread From: Michael Roth @ 2012-01-27 0:01 UTC (permalink / raw) To: Eric Blake; +Cc: libvir-list, Michal Privoznik, QEMU Developers On 01/26/2012 09:18 AM, Eric Blake wrote: > [adding qemu-devel] > > On 01/26/2012 07:46 AM, Daniel P. Berrange wrote: >>> One thing, that you'll probably notice is this >>> 'set-support-level' command. Basically, it tells GA what qemu version >>> is it running on. Ideally, this should be done as soon as >>> GA starts up. However, that cannot be determined from outside >>> world as GA doesn't emit any events yet. >>> Ideally^2 this command should be left out as it should be qemu >>> who tells its own agent this kind of information. >>> Anyway, I was going to call this command in qemuProcess{Startup, >>> Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs >>> so guest can boot and start GA, but that implies returning from qemuProcess*. >>> >>> So I am setting this just before 'guest-suspend' command, as >>> there is one more thing about GA. It is unable to remember anything >>> upon its restart (GA process). Which has BTW show flaw >>> in our current code with FS freeze& thaw. If we freeze guest >>> FS, and somebody restart GA, the simple FS Thaw will not succeed as >>> GA thinks FS are not frozen. But that's a different cup of tea. We could drop the state tracking (guest-fsfreeze-status) and issue the freeze/thaw unconditionally. Talked with Anthony and going stateless in general seems to solve a lot of nastiness that might pop up with the current implementation. I'll send some patches out soon for fsfreeze, guest-file* may end up getting similar treatment in the not-too-distant-future. ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2012-01-30 18:37 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <cover.1327585806.git.mprivozn@redhat.com> [not found] ` <20120126144632.GM21211@redhat.com> 2012-01-26 15:18 ` [Qemu-devel] [libvirt] [PATCH RFC 0/4] Allow hibernation on guests Eric Blake 2012-01-26 19:35 ` Luiz Capitulino 2012-01-26 19:41 ` Michal Privoznik 2012-01-26 20:13 ` Luiz Capitulino 2012-01-26 22:51 ` Michael Roth 2012-01-26 22:57 ` Anthony Liguori 2012-01-30 12:57 ` Luiz Capitulino 2012-01-30 13:54 ` Anthony Liguori 2012-01-30 14:44 ` Luiz Capitulino 2012-01-30 15:43 ` Michael Roth 2012-01-30 15:58 ` Eric Blake 2012-01-30 17:07 ` Michael Roth 2012-01-30 18:30 ` Luiz Capitulino 2012-01-30 16:08 ` Michal Privoznik 2012-01-30 18:36 ` Luiz Capitulino 2012-01-30 15:03 ` Michael Roth 2012-01-26 22:54 ` Anthony Liguori 2012-01-27 0:01 ` Michael Roth
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).