Date: Wed, 3 Sep 2014 00:19:14 +0200
From: Benoît Canet
Message-ID: <20140902221914.GA24069@irqsave.net>
References: <1409557394-11853-1-git-send-email-namei.unix@gmail.com>
In-Reply-To: <1409557394-11853-1-git-send-email-namei.unix@gmail.com>
Subject: Re: [Qemu-devel] [PATCH 0/8] add basic recovery logic to quorum driver
To: Liu Yuan
Cc: Kevin Wolf, qemu-devel@nongnu.org, Stefan Hajnoczi

On Monday 01 Sep 2014 at 15:43:06 (+0800), Liu Yuan wrote:

Liu,

Do you think this could work with qcow2 files backed by NFS servers?

Best regards

Benoît

> This patch set mainly adds two pieces of logic to implement device recovery:
> - notify the quorum driver of broken states from the child driver(s)
> - track dirty regions and sync the device after it is repaired
>
> Thus quorum allows VMs to continue running while some child devices are
> broken; when the child devices are repaired and come back, we sync the bits
> dirtied during the downtime to keep the data consistent.
>
> The recovery logic is based on the driver state bitmap and syncs the dirty
> bits within a timeslice window in a coroutine in this primitive
> implementation.
>
> Simple graph of 2 children with threshold=1 and read-pattern=fifo
> (similar to DRBD):
>
> + denotes a device sync iteration
> - IO on a single device
> = IO on two devices
>
>                                sync complete, release dirty bitmap
>                                        ^
>                                        |
> ====-----------------++++----++++----++==========
>     |                |
>     |                v
>     |        device repaired and begin to sync
>     v
>   device broken, create a dirty bitmap
>
> This sync logic can take care of the nested-broken problem, where devices
> break again while a sync is in progress. We just start a new sync process
> after the devices are repaired again, and switch the devices from broken to
> sound only when the sync completes.
>
> read-pattern=quorum mode benefits from the recovery logic without any
> problem.
>
> Todo:
> - use the aio interface to sync data (multiple transfers in one go)
> - dynamic slice window to control sync bandwidth more smoothly
> - add an auto-reconnection mechanism to other protocols (if not supported
>   yet)
> - add tests
>
> Cc: Eric Blake
> Cc: Benoit Canet
> Cc: Kevin Wolf
> Cc: Stefan Hajnoczi
>
> Liu Yuan (8):
>   block/quorum: initialize qcrs.aiocb for read
>   block: add driver operation callbacks
>   block/sheepdog: propagate disconnect/reconnect events to upper driver
>   block/quorum: add quorum_aio_release() helper
>   quorum: fix quorum_aio_cancel()
>   block/quorum: add broken state to BlockDriverState
>   block: add two helpers
>   quorum: add basic device recovery logic
>
>  block.c                   |  17 +++
>  block/quorum.c            | 324 +++++++++++++++++++++++++++++++++++++++++-----
>  block/sheepdog.c          |   9 ++
>  include/block/block.h     |   9 ++
>  include/block/block_int.h |   6 +
>  trace-events              |   5 +
>  6 files changed, 336 insertions(+), 34 deletions(-)
>
> -- 
> 1.9.1
>
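
For readers following the thread, the recovery flow described in the quoted
cover letter can be sketched roughly as a toy model. This is only an
illustration in Python, not the patch's C code: the names ToyQuorum, Child,
child_broken, child_repaired and sync_slice are hypothetical, the "bitmap" is
a plain set of chunk indices, and the timeslice window is approximated by a
maximum number of chunks copied per call.

```python
class Child:
    """Hypothetical stand-in for one child device, split into fixed-size chunks."""

    def __init__(self, nchunks):
        self.data = [0] * nchunks
        self.broken = False


class ToyQuorum:
    """Toy model (not QEMU code): one always-sound child and one that may break."""

    def __init__(self, nchunks):
        self.good = Child(nchunks)
        self.flaky = Child(nchunks)
        self.dirty = None      # set of dirty chunk indices; None = no bitmap exists
        self.syncing = False

    def child_broken(self):
        # Device broken: create a dirty bitmap, or keep the existing one if the
        # device breaks again while a sync is in progress (nested breakage).
        self.flaky.broken = True
        self.syncing = False
        if self.dirty is None:
            self.dirty = set()

    def child_repaired(self):
        # Device repaired: start syncing, but it is not "sound" until sync ends.
        self.flaky.broken = False
        self.syncing = True

    def write(self, chunk, value):
        self.good.data[chunk] = value
        if self.dirty is None:
            # Both children are sound: IO goes to two devices.
            self.flaky.data[chunk] = value
        else:
            # Broken or still syncing: IO goes to one device, chunk is tracked.
            self.dirty.add(chunk)

    def sync_slice(self, max_chunks):
        """Copy at most max_chunks dirty chunks; return True once sync completes."""
        if not self.syncing:
            return False
        for chunk in sorted(self.dirty)[:max_chunks]:
            self.flaky.data[chunk] = self.good.data[chunk]
            self.dirty.discard(chunk)
        if not self.dirty:
            # Sync complete: release the bitmap, child is sound again.
            self.dirty = None
            self.syncing = False
            return True
        return False
```

In this sketch, writes that arrive between two sync_slice() calls are simply
added to the set again, which is one way to picture the alternating "++++"
and "----" phases in the graph. How the real driver interleaves guest IO with
the sync coroutine is not shown here.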