From: Michael Roth <mdroth@linux.vnet.ibm.com>
To: Tomoki Sekiyama <tomoki.sekiyama@hds.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Cc: "libaiqing@huawei.com" <libaiqing@huawei.com>,
"ghammer@redhat.com" <ghammer@redhat.com>,
"stefanha@gmail.com" <stefanha@gmail.com>,
"lcapitulino@redhat.com" <lcapitulino@redhat.com>,
"vrozenfe@redhat.com" <vrozenfe@redhat.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
Seiji Aguchi <seiji.aguchi@hds.com>,
"lersek@redhat.com" <lersek@redhat.com>,
"areis@redhat.com" <areis@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v8 07/10] qemu-ga: Add Windows VSS provider and requester as DLL
Date: Tue, 30 Jul 2013 17:24:54 -0500 [thread overview]
Message-ID: <20130730222454.14585.82193@loki> (raw)
In-Reply-To: <CE1D8F01.4BD2%Tomoki.Sekiyama@hds.com>
Quoting Tomoki Sekiyama (2013-07-30 15:54:05)
> On 7/30/13 15:35 , "Michael Roth" <mdroth@linux.vnet.ibm.com> wrote:
>
> >>>One small issue I noticed was that this error will get overwritten
> >>>with the VSS writer timeout error if we wait longer than 60s before
> >>>calling guest-fsfreeze-thaw. It might give some users false assurances
> >>>about what aspects of their snapshot may be volatile so it's
> >>>probably worth addressing.
> >>This is an error returned against guest-fsfreeze-freeze, when the
> >>writers or filesystems take more than 60s to quiesce.
> >>(CQGAVssProvider::CommitSnapshots that issues FrozenEvent is
> >>called after this quiescing succeeded.) The VSS sequence is aborted
> >>at "out:". If this happens, as the system remains thawed state, the
> >>following guest-fsfreeze-thaw will just return 0.
> >
> >This is the example I'm referring to:
> >
> >{'execute':'guest-fsfreeze-freeze'}
> >{"return": 2}
> >/* wait 10+ seconds */
> >{'execute':'guest-fsfreeze-thaw'}
> >{"error": {"class": "GenericError", "desc": "couldn't hold writes:
> >fsfreeze is limited up to 10 seconds: (error: 80042314)"}}
> >{'execute':'guest-fsfreeze-freeze'}
> >{"return": 2}
> >/* wait 60+ seconds */
> >{'execute':'guest-fsfreeze-thaw'}
> >{"error": {"class": "GenericError", "desc": "failed to do snapshot set:
> >(error: 8004230f)"}}
> >
> >It this seems to be because CommitSnapshot returns in the latter instance
> >due
> >to VSS_TIMEOUT_MSEC wait and it's E_ABORT error message overwrites the
> >VSS_E_HOLD_WRITES_TIMEOUT from earlier. Perhaps we could just have
> >CommitSnapshot return VSS_E_HOLD_WRITES_TIMEOUT if it doesn't get the thaw
> >event in time? I think that error message is much more informative for
> >users.
>
> Agreed.
>
> How about modifying the provider to return S_OK instead of E_ABORT
> when it exceeds VSS_TIMEOUT_MSEC?
>
> As VSS_TIMEOUT_MSEC is larger than 10 seconds fsfreeze timeout,
> CommitSnapshot will return VSS_E_HOLD_WRITES_TIMEOUT on provider's timeout,
> and give "fsfreeze is limited up to 10 seconds" message to users.
> The writers are also automatically thawed by 60 seconds timeout anyway,
>
> this wouldn't break anything.
Hmm, it seems like it would work, but I'm a bit worried about returning
S_OK and hoping VSS still produces an error. Probably really unlikely, but
I could imagine a really strange timing issue where the 60 second timeout
for whatever reason gets triggered before the 10 second one (maybe because
one vcpu is being starved for whatever reason, don't know enough about
windows timekeeping mechanisms to know how plausible this is, but I'd
rather not rely on it not happening).
One way you could maybe do it would be maybe introducing a hEventTimeout
that CommitSnapshots will set if it times out. That way we can poll
it to see if the event is set when we get VSS_E_UNEXPECTED_PROVIDER_ERROR
in the requester. If it's set we know a timeout occurred on provider side,
and return the E_VSS_HOLD_WRITES_TIMEOUT error.
>
> The waiting code in CQGAVssProvider::CommitSnapshots would be like this:
>
> ...
> /* Wait until the snapshot is taken by the host. */
> wait = WaitForSingleObject(hEventThaw, VSS_TIMEOUT_MSEC);
> switch (wait) {
> case WAIT_TIMEOUT:
> /*
> * We return S_OK, but the qemu-ga will get
> E_VSS_HOLD_WRITES_TIMEOUT
> * because 10 seconds fsfreeze timeout is exceeded. (If we return
> some
> * error here, it results in VSS_E_UNEXPECTED_PROVIDER_ERROR that
> can
> * be caused by the other reasons.)
> */
> case WAIT_OBJECT_0:
> hr = S_OK;
> break;
>
> default:
> hr = E_ABORT;
> }
> ...
>
>
> Thanks,
> Tomoki Sekiyama
next prev parent reply other threads:[~2013-07-30 22:25 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-07-23 22:45 [Qemu-devel] [PATCH v8 00/10] qemu-ga: fsfreeze on Windows using VSS Tomoki Sekiyama
2013-07-23 22:45 ` [Qemu-devel] [PATCH v8 01/10] configure: Support configuring C++ compiler Tomoki Sekiyama
2013-07-25 23:07 ` Michael Roth
2013-07-23 22:45 ` [Qemu-devel] [PATCH v8 02/10] Add c++ keywords to QAPI helper script Tomoki Sekiyama
2013-07-23 22:45 ` [Qemu-devel] [PATCH v8 03/10] checkpatch.pl: Check .cpp files Tomoki Sekiyama
2013-07-25 23:16 ` Michael Roth
2013-07-23 22:45 ` [Qemu-devel] [PATCH v8 04/10] Add a script to extract VSS SDK headers on POSIX system Tomoki Sekiyama
2013-07-23 22:45 ` [Qemu-devel] [PATCH v8 05/10] qemu-ga: Add configure options to specify path to Windows/VSS SDK Tomoki Sekiyama
2013-07-23 22:45 ` [Qemu-devel] [PATCH v8 06/10] error: Add error_set_win32 and error_setg_win32 Tomoki Sekiyama
2013-07-23 22:45 ` [Qemu-devel] [PATCH v8 07/10] qemu-ga: Add Windows VSS provider and requester as DLL Tomoki Sekiyama
2013-07-25 23:36 ` Michael Roth
2013-07-27 1:22 ` Michael Roth
2013-07-29 18:32 ` Tomoki Sekiyama
2013-07-30 19:35 ` Michael Roth
2013-07-30 20:54 ` Tomoki Sekiyama
2013-07-30 22:24 ` Michael Roth [this message]
2013-07-30 22:28 ` Tomoki Sekiyama
2013-07-23 22:45 ` [Qemu-devel] [PATCH v8 08/10] qemu-ga: Call Windows VSS requester in fsfreeze command handler Tomoki Sekiyama
2013-07-30 20:17 ` Michael Roth
2013-07-23 22:46 ` [Qemu-devel] [PATCH v8 09/10] qemu-ga: Install Windows VSS provider on `qemu-ga -s install' Tomoki Sekiyama
2013-07-30 20:20 ` Michael Roth
2013-07-23 22:46 ` [Qemu-devel] [PATCH v8 10/10] QMP/qemu-ga-client: Make timeout longer for guest-fsfreeze-freeze command Tomoki Sekiyama
2013-07-30 20:21 ` Michael Roth
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130730222454.14585.82193@loki \
--to=mdroth@linux.vnet.ibm.com \
--cc=areis@redhat.com \
--cc=ghammer@redhat.com \
--cc=lcapitulino@redhat.com \
--cc=lersek@redhat.com \
--cc=libaiqing@huawei.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=seiji.aguchi@hds.com \
--cc=stefanha@gmail.com \
--cc=tomoki.sekiyama@hds.com \
--cc=vrozenfe@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).