From: John Snow
To: Nikolay Shirokovskiy, qemu-devel@nongnu.org
Cc: Denis Lunev, Vladimir Sementsov-Ogievskiy, Maxim Nestratov, Eric Blake, Jeff Cody
Subject: Re: [Qemu-devel] [RFC] dirty bitmap state uncertainty under certain conditions
Date: Tue, 22 Nov 2016 11:07:38 -0500
Message-ID: <6602e519-1d25-86be-855e-d29155ec267c@redhat.com>
In-Reply-To: <2fb12281-1023-71c0-7fd9-39e27787c1e9@virtuozzo.com>
References: <2fb12281-1023-71c0-7fd9-39e27787c1e9@virtuozzo.com>

On 11/22/2016 07:01 AM, Nikolay Shirokovskiy wrote:
> Hi, everyone.
>
> There is a problem with current incremental backups. Imagine I ask qemu to
> make an incremental backup, then go away and come back when the backup
> job is finished. The qemu process dismisses the job completely and I have
> missed all the events, so I don't know the result of the operation and, most
> importantly, I don't know the base for the dirty bitmap now. In case of
> failure it is the previous backup, and in case of success it is the last backup.
> Qemu does not track the dirty bitmap base for me, so I have no choice other
> than to clear the dirty bitmap and make a full backup, which would be rather
> unexpected from the user's POV. (The situation of going away/coming back is
> a libvirt crash/restart, of course.)
>

Why was the completion/failure event missed? Is there some reason why you cannot guarantee that you will observe the completion?

> I guess the problem has a wider scope. If I miss the successful completion
> of a full backup, my only option is to drop the backup file and redo the
> backup completely, which is rather wasteful. AFAIU I cannot query the backup
> completion result from the backup file itself. I guess there can be similar
> issues for other qemu jobs.
>
> Nikolay
>

I would personally advocate for a job-neutral solution where jobs can be given a parameter such that the job persists in memory in a new "completed" state until it is queried explicitly, at which point it can be dropped.

I am not sure if we can make this the default behavior, as it might confuse libvirt to occasionally see jobs that have already completed.

When I talked to Kevin off-list, he suggested that we might be able to make this the default behavior if we pivot to the new jobs API that I have been proposing, accompanied by a new explicit command to put a job to rest.

I can work on this for 2.9, though we may still need a "temporary" solution for the old jobs API until we're ready to officially deprecate the older interface.
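
To make the proposed semantics concrete, here is a rough sketch of the idea in Python. It is only a toy model, not QEMU code: `JobRegistry`, the `persist` parameter, and the `query` method are names I made up for illustration, and none of them correspond to an existing QMP command or internal API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class JobState(Enum):
    RUNNING = "running"
    COMPLETED = "completed"  # the proposed new state


@dataclass
class Job:
    job_id: str
    persist: bool = False            # hypothetical per-job parameter
    state: JobState = JobState.RUNNING
    error: Optional[str] = None      # None means the job succeeded


class JobRegistry:
    """Toy model of a job list; all names here are illustrative."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Job] = {}

    def start(self, job_id: str, persist: bool = False) -> Job:
        job = Job(job_id=job_id, persist=persist)
        self._jobs[job_id] = job
        return job

    def finish(self, job_id: str, error: Optional[str] = None) -> None:
        job = self._jobs[job_id]
        job.state = JobState.COMPLETED
        job.error = error
        if not job.persist:
            # Current behavior: the job vanishes as soon as it completes.
            del self._jobs[job_id]

    def query(self, job_id: str) -> Optional[Job]:
        job = self._jobs.get(job_id)
        if job is not None and job.state is JobState.COMPLETED:
            # Proposed behavior: drop the job once its result has been read.
            del self._jobs[job_id]
        return job
```

With something like this, a management client that restarted in the middle of a backup could query the job by ID afterwards, read whether it succeeded, and from that decide which backup is the base for the dirty bitmap. A job started without `persist` behaves as it does today and is gone by the time anyone asks.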