From: Michael Roth <mdroth@linux.vnet.ibm.com>
To: Jes Sorensen <Jes.Sorensen@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>,
qemu-devel@nongnu.org, agl@us.ibm.com
Subject: Re: [Qemu-devel] [PATCH 1/2] Add virtagent file system freeze/thaw
Date: Thu, 03 Feb 2011 11:41:43 -0600 [thread overview]
Message-ID: <4D4AE8D7.3060209@linux.vnet.ibm.com> (raw)
In-Reply-To: <4D491A5A.5020008@redhat.com>
On 02/02/2011 02:48 AM, Jes Sorensen wrote:
> On 02/02/11 08:57, Stefan Hajnoczi wrote:
>> On Tue, Feb 1, 2011 at 10:58 AM,<Jes.Sorensen@redhat.com> wrote:
>>> From: Jes Sorensen<Jes.Sorensen@redhat.com>
>>>
>>> Implement freeze/thaw support in the guest, allowing the host to
>>> request the guest freezes all it's file systems before a live snapshot
>>> is performed.
>>> - fsfreeze(): Walk the list of mounted local real file systems,
>>> and freeze them.
>>
>> Does this add a requirement that guest agent code issues no disk I/O
>> in its main loop (e.g. logging)? Otherwise we might deadlock
>> ourselves waiting for I/O which is never issued.
>
> Yes very much so[1] - one reason why it would be nice to have virtagent
> use threads to execute the actual commands. We should probably add a
> flag to agent commands indicating whether they issue disk I/O or not, so
> we can block attempts to execute commands that do so, while the guest is
> frozen.
**Warning, epic response**
For things like logging and i/o on a frozen system...I agree we'd need
some flag for these kinds of situations. Maybe a disable_logging()
flag....i really don't like this though... I'd imagine even syslogd()
could block virtagent in this type of situation, so that would need to
be disabled as well.
But doing so completely subverts our attempts and providing proper
accounting of what the agent is doing to the user. A user can freeze the
filesystem, knowing that logging would be disabled, then prod at
whatever he wants. So the handling should be something specific to
fsfreeze, with stricter requirements:
If a user calls fsfreeze(), we disable logging, but also disable the
ability to do anything other than fsthaw() or fsstatus(). This actually
solves the potential deadlocking problem for other RPCs as well...since
they cant be executed in the first place.
So I think that addresses the agent deadlocking itself, post-freeze.
However, fsfreeze() itself might lock-up the agent as well...I'm not
confident we can really put any kind of bound on how long it'll take to
execute, and if we timeout on the client-side the agent can still block
here.
Plus there are any number of other situations where an RPC can still
hang things...in the future when we potentially allow things like script
execution, they might do something like attempt to connect to a socket
that's already in use and wait on the server for an arbitrary amount of
time, or open a file on an nfs share that in currently unresponsive.
So a solution for these situations is still needed, and I'm starting to
agree that threads are needed, but I don't think we should do RPCs
concurrently (not sure if that's what is being suggested or not). At
least, there's no pressing reason for it as things currently stand
(there aren't currently any RPCs where fast response times are all that
important, so it's okay to serialize them behind previous RPCs, and
HMP/QMP are command at a time), and it's something that Im fairly
confident can be added if the need arises in the future.
But for dealing with a situation where an RPC can hang the agent, I
think one thread should do it. Basically:
We associate each RPC with a time limit. Some RPCs, very special ones
that we'd trust with our kids, could potentially specify an unlimited
timeout. The client side should use this same timeout on it's end. In
the future we might allow the user to explicitly disable the timeout for
a certain RPC. The logic would then be:
- read in a client RPC request
- start a thread to do RPC
- if there's a timeout, register an alarm(<timeout>), with a handler
that will call something like pthread_kill(current_worker_thread). On
the thread side, this signal will induce a pthread_exit()
- wait for the thread to return (pthread_join(current_worker_thread))
- return it's response back to the caller if it finished, return a
timeout indication otherwise
>
> Cheers,
> Jes
>
> [1] speaking from experience ... a Linux desktop gets really upset if
> you freeze the file systems from a command in an xterm.... ho hum
next prev parent reply other threads:[~2011-02-03 17:41 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-02-01 10:58 [Qemu-devel] [PATCH 0/2] virtagent - fsfreeze support Jes.Sorensen
2011-02-01 10:58 ` [Qemu-devel] [PATCH 1/2] Add virtagent file system freeze/thaw Jes.Sorensen
2011-02-01 14:12 ` Stefan Hajnoczi
2011-02-01 14:26 ` Jes Sorensen
2011-02-01 14:34 ` Stefan Hajnoczi
2011-02-01 14:36 ` Jes Sorensen
2011-02-01 14:41 ` Stefan Hajnoczi
2011-02-01 17:22 ` Michael Roth
2011-02-01 14:48 ` [Qemu-devel] " Adam Litke
2011-02-01 15:02 ` Jes Sorensen
2011-02-01 16:50 ` Michael Roth
2011-02-02 8:38 ` Jes Sorensen
2011-02-02 7:57 ` [Qemu-devel] " Stefan Hajnoczi
2011-02-02 8:48 ` Jes Sorensen
2011-02-03 17:41 ` Michael Roth [this message]
2011-02-04 6:13 ` Stefan Hajnoczi
2011-02-04 16:27 ` Michael Roth
2011-02-04 16:52 ` Stefan Hajnoczi
2011-02-04 11:03 ` Jes Sorensen
2011-02-04 16:51 ` Michael Roth
2011-02-01 10:58 ` [Qemu-devel] [PATCH 2/2] Add monitor commands for fsfreeze support Jes.Sorensen
2011-02-01 11:25 ` [Qemu-devel] [PATCH 0/2] virtagent - " Vasiliy G Tolstov
2011-02-01 13:02 ` Jes Sorensen
2011-02-01 16:04 ` Richard W.M. Jones
2011-02-01 20:04 ` Vasiliy G Tolstov
2011-02-01 20:17 ` Richard W.M. Jones
2011-02-01 14:16 ` Stefan Hajnoczi
2011-02-01 14:28 ` Jes Sorensen
-- strict thread matches above, loose matches on Subject: below --
2011-02-02 8:42 [Qemu-devel] [PATCH v2 " Jes.Sorensen
2011-02-02 8:42 ` [Qemu-devel] [PATCH 1/2] Add virtagent file system freeze/thaw Jes.Sorensen
2011-02-04 10:57 [Qemu-devel] [PATCH v3 0/2] virtagent - fsfreeze support Jes.Sorensen
2011-02-04 10:57 ` [Qemu-devel] [PATCH 1/2] Add virtagent file system freeze/thaw Jes.Sorensen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4D4AE8D7.3060209@linux.vnet.ibm.com \
--to=mdroth@linux.vnet.ibm.com \
--cc=Jes.Sorensen@redhat.com \
--cc=agl@us.ibm.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).