From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 10 Sep 2014 15:12:52 +0200
From: Benoît Canet
Message-ID: <20140910131252.GA12470@irqsave.net>
References: <1409557394-11853-1-git-send-email-namei.unix@gmail.com> <20140907151231.GA25961@irqsave.net> <20140910071822.GA18858@ubuntu-trusty>
In-Reply-To: <20140910071822.GA18858@ubuntu-trusty>
Subject: Re: [Qemu-devel] [PATCH 0/8] add basic recovery logic to quorum driver
To: Liu Yuan
Cc: Benoît Canet, Kevin Wolf, qemu-devel@nongnu.org, Stefan Hajnoczi

The Wednesday 10 Sep 2014 à 15:18:22 (+0800), Liu Yuan wrote :
> On Sun, Sep 07, 2014 at 05:12:31PM +0200, Benoît Canet wrote:
> > The Monday 01 Sep 2014 à 15:43:06 (+0800), Liu Yuan wrote :
> > > This patch set mainly adds two pieces of logic to implement device recovery:
> > > - notify the quorum driver of the broken states from the child driver(s)
> > > - dirty track and sync the device after it is repaired
> > >
> > > Thus quorum allows VMs to continue while some child devices are broken, and when
> > > the child devices are repaired and return, we sync the bits dirtied during the
> > > downtime to keep the data consistent.
> > >
> > > The recovery logic is based on the driver state bitmap and will sync the dirty
> > > bits with a timeslice window in a coroutine in this primitive implementation.
> > >
> > > Simple graph about 2 children with threshold=1 and read-pattern=fifo
> > > (similar to DRBD):
> > >
> > > + denotes a device sync iteration
> > > - IO on a single device
> > > = IO on two devices
> > >
> > >                             sync complete, release dirty bitmap
> > >                                          ^
> > >                                          |
> > >   ====-----------------++++----++++----++==========
> > >       |                |
> > >       |                v
> > >       |              device repaired and begin to sync
> > >       v
> > >     device broken, create a dirty bitmap
> > >
> > >   This sync logic can handle the nested-failure case, where devices break
> > >   again while in sync. We just start a new sync process after the devices are
> > >   repaired again and switch the devices from broken to sound only when the sync
> > >   completes.
> > >
> > > For read-pattern=quorum mode, it enjoys the recovery logic without any problem.
> > >
> > > Todo:
> > > - use aio interface to sync data (multiple transfers in one go)
> > > - dynamic slice window to control sync bandwidth more smoothly
> > > - add auto-reconnection mechanism to other protocols (if not supported yet)
> > > - add tests
> > >
> > > Cc: Eric Blake
> > > Cc: Benoit Canet
> > > Cc: Kevin Wolf
> > > Cc: Stefan Hajnoczi
> > >
> > > Liu Yuan (8):
> > >   block/quorum: initialize qcrs.aiocb for read
> > >   block: add driver operation callbacks
> > >   block/sheepdog: propagate disconnect/reconnect events to upper driver
> > >   block/quorum: add quorum_aio_release() helper
> > >   quorum: fix quorum_aio_cancel()
> > >   block/quorum: add broken state to BlockDriverState
> > >   block: add two helpers
> > >   quorum: add basic device recovery logic
> > >
> > >  block.c                   |  17 +++
> > >  block/quorum.c            | 324 +++++++++++++++++++++++++++++++++++++++++-----
> > >  block/sheepdog.c          |   9 ++
> > >  include/block/block.h     |   9 ++
> > >  include/block/block_int.h |   6 +
> > >  trace-events              |   5 +
> > >  6 files changed, 336 insertions(+), 34 deletions(-)
> > >
> > > -- 
> > > 1.9.1
> > >
> >
> > Hi Liu,
> >
> > Have you noticed that your series conflicts with one of Fam's series in the quorum cancel
> > function fix patch?
>
> Not yet, thanks for the reminder. I think Fam somehow digested your patch.
>
> > Could you find an arrangement with Fam so the two patches don't collide anymore?
> >
> > Do you intend to respin your series?
>
> Yes, I'll rebase the v2 later before more possible reviews.
>
> Thanks
> Yuan
>