From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:60668) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a1404-0008SU-D3 for qemu-devel@nongnu.org; Mon, 23 Nov 2015 22:10:37 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a1400-0005if-Uk for qemu-devel@nongnu.org; Mon, 23 Nov 2015 22:10:36 -0500 Received: from mx1.redhat.com ([209.132.183.28]:50705) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a1400-0005iN-Iq for qemu-devel@nongnu.org; Mon, 23 Nov 2015 22:10:32 -0500 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (Postfix) with ESMTPS id D2C6C8E362 for ; Tue, 24 Nov 2015 03:10:31 +0000 (UTC) Date: Tue, 24 Nov 2015 11:10:27 +0800 From: Fam Zheng Message-ID: <20151124031027.GC26733@ad.usersys.redhat.com> References: <1448273262-13845-1-git-send-email-peterx@redhat.com> <56533D45.1060108@redhat.com> <20151123175759.GG3606@hawk.localdomain> <5653C422.3040307@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5653C422.3040307@redhat.com> Subject: Re: [Qemu-devel] [PATCH REPOST 0/2] Add basic "detach" support for dump-guest-memory List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Peter Xu Cc: Andrew Jones , qemu-devel@nongnu.org, armbru@redhat.com, lcapitulino@redhat.com, pbonzini@redhat.com, Laszlo Ersek On Tue, 11/24 09:57, Peter Xu wrote: > On 11/24/2015 01:57 AM, Andrew Jones wrote: > > On Mon, Nov 23, 2015 at 05:22:29PM +0100, Laszlo Ersek wrote: > >> On 11/23/15 11:07, Peter Xu wrote: > >>> Currently, dump-guest-memory supports synchronous operation only. This patch > >>> sets are adding "detach" support for it (just like "migrate -d" for > >>> migration). When "-d" is provided, dump-guest-memory command will return > >>> immediately without hanging user. This should be useful when the backend > >>> storage for the dump file is very slow. > >>> > >>> Peter Xu (2): > >>> dump-guest-memory: add "detach" flag for QMP/HMP interfaces > >>> dump-guest-memory: add basic "detach" support. > >>> > >>> dump.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++----- > >>> hmp-commands.hx | 5 +++-- > >>> hmp.c | 3 ++- > >>> include/sysemu/dump.h | 4 ++++ > >>> qapi-schema.json | 3 ++- > >>> qmp-commands.hx | 4 ++-- > >>> qmp.c | 9 ++++++++ > >>> vl.c | 3 +++ > >>> 8 files changed, 81 insertions(+), 12 deletions(-) > >>> > >> > >> I'm not seeing anything that would prevent races between the new thread > >> and any other existing threads that manipulate the MemoryRegion objects > >> (in response to guest actions), or the guest RAM contents (by way of > >> executing guest code). > >> > >> The dump_init() function has > >> > >> if (runstate_is_running()) { > >> vm_stop(RUN_STATE_SAVE_VM); > >> s->resume = true; > >> } else { > >> s->resume = false; > >> } > >> > >> Whereas dump_cleanup() has: > >> > >> if (s->resume) { > >> vm_start(); > >> } > >> > >> If you return control to the QEMU monitor's user before the dump > >> completes, they could issue the "cont" command, and unleash the VCPU > >> threads again. (Of course, this is just one example where things could > >> go wrong.) > > Yes, I added the global flag "dump_in_progress_p" to do this. For > now, what I found might be affected was "dump-guest-memory" itself, > and "cont". Please check patch 2/2 modification for qmp_cont(). I > failed to find any other place that might be influenced by this > asynchronous operation (you are right, maybe it still exists, and it > might introduce extra bugs, actually that's what I was looking for > to see whether I missed something in the review session). What about all the hot-plug commands that changes the memory layout? Another question is what if user issued "stop" during dump, should you still resume when dump completes? > > >> > >> Also, the live migration analogy is not a good one IMO. For live > >> migration, a whole infrastructure exists for tracking asynchronous guest > >> state changes (dirty bitmap, locking, whatever). > >> > >> The good analogy with live migration would be continuous streaming of > >> guest memory changes into the dump file, until it converges, or a cutoff > >> is reached (at which point the guest would be frozen, same as now). Of > >> course, such streaming could generate huge amounts of traffic and > >> entirely defeat the original purpose. > > Yes, I see that migration is much more complex scenario, so that's > why I am trying to add "basic detach" support, just as I mentioned > in the patch title. :) > > Before doing anything like that complex, I will send a mail asking > about it, to first know whether we need to do that. > > >> > >> In total, I don't think this is a good idea. I find it possible that > >> this would lead to QEMU crashes, and/or wildly inconsistent guest memory > >> images. > > > > Despite having already run through both patches giving review comments, > > I agree with Laszlo. At first blush it seems like a good idea, but I > > can't think of how it would be safe. Also, an admin can always background > > the task that invokes the dump if they need that particular terminal > > back. So, this looks more like a management tool problem to solve, if > > anything. > > > >> > >> As for the goal itself... People also tend to cope with *kdump* being > >> slow on physical machines. > >> > >> My recommendation would be to use the dump compression feature > >> (especially lzo and snappy). In my experience, even when people are > >> aware of their existence, they don't always realize that shrinking the > >> dump file size by a given factor also shrinks the dumping *time* by the > >> same factor, provided that the dumping process remains IO-bound even in > >> the compressed case. > >> > >> Which it should, assuming a "very slow storage" -- lzo and snappy are > >> very CPU-efficient. > > > > This has been my experience, i.e. using lzo or snappy tends to be much, > > much faster. > > Sorry that I am not the daily user of dump-guest-memory, so I may > have not tried to compare how time would save when compression > techniques are used. Thanks (Drew & Laszlo) to let me know this. > Actually, what I am coping with is the bz: > > https://bugzilla.redhat.com/show_bug.cgi?id=1193826 I don't think this mode is very helpful to fix this issue unless there is a way to query the dump progress. > > I just feel like it would be nice to offer something extra, when > people are using the stdio monitor, they could have another choice > when dump. Also, this is my first patch to QEMU. That's all I > thought about. > > Thanks you all (especially Drew and Laszlo) for leaving mass review > comments. After knowing that more than one of you would suggest not > taking the risk comparing to the feature it brings, I'd totally > agree to drop this patch. > > Thanks. > Peter > > > > > Thanks, > > drew > > >