Date: Thu, 28 Aug 2014 11:22:09 +0100
From: Stefan Hajnoczi
Message-ID: <20140828102209.GC26741@stefanha-thinkpad.redhat.com>
In-Reply-To: <280510069.69184.1408990389981.JavaMail.zimbra@xes-inc.com>
Subject: Re: [Qemu-devel] Using cache=writeback safely on qemu 1.4.0 and later
To: Andrew Martin
Cc: qemu-devel

On Mon, Aug 25, 2014 at 01:13:09PM -0500, Andrew Martin wrote:
> > >> > I recently experienced UPS failure on several hosts which caused a hard
> > >> > shutdown. After restarting, 3 of the guests had corruption on their disks
> > >> > and required a fairly long fsck to fix. Afterwards, data that had been
> > >> > written to the disks several hours before the crash was corrupted, which
> > >> > makes me think that it was never fsync()-ed to the non-volatile storage.
> > >>
> > >> What exactly was the "corruption" you encountered? Which application,
> > >> error message, etc.
> > >
> > > Two of the servers are web servers with apache2. In one case, a python
> > > daemon copies JPGs onto the server - the last 100 copied onto the server
> > > were corrupted. In another case, some files had been uploaded several
> > > days prior to the www-root, but after the hard reset said files were no
> > > longer present in the filesystem.
> >
> > Did the Python daemon fsync the files and directories it modified/created?
> >
> > Did you sync(1) after copying files to www-root?
> >
> > Also, you didn't explain what "corrupted" means. Were the jpg files
> > missing, were they zero bytes in size, were they filled with junk,
> > etc?
> >
> The jpgs appeared to be a normal size, but were filled with junk. The files
> uploaded by apache2 were missing from the filesystem.
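
To make "fsync the files and directories" concrete, here is a minimal
sketch of what a copying daemon has to do so that a freshly copied file
survives a power loss.  The helper name and paths are made up for
illustration; this is not taken from the daemon in question:

    import os

    def copy_durably(src_path, dst_path):
        # Copy src_path to dst_path and make the result survive a power
        # failure.  Hypothetical helper, for illustration only.
        with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
            dst.write(src.read())
            dst.flush()             # drain the userspace buffer into the page cache
            os.fsync(dst.fileno())  # force file data and metadata to stable storage

        # The new directory entry must be flushed too, otherwise the file
        # contents can survive the crash while the name pointing at them
        # does not.
        dirfd = os.open(os.path.dirname(dst_path) or ".", os.O_RDONLY)
        try:
            os.fsync(dirfd)
        finally:
            os.close(dirfd)

Running sync(1) after the copy is a coarser alternative: it flushes all
dirty data system-wide rather than just the files you care about.
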
>
> Even if the python daemon or apache2 did not fsync the modified files, isn't
> there some action that the OS takes periodically to flush dirty pages to disk?
> This seems to be implied in the SuSE documentation:
> https://www.suse.com/documentation/sles11/book_kvm/data/sect1_1_chapter_book_kvm.html
> "the normal page cache management will handle commitment to the storage device."
>
> In the case of the files uploaded by apache2, they were added to the server days
> before the power outage, so it seems like there would have been ample time for
> those changes to have been flushed.

In the general case of copying/creating some files and hoping that they
will be persistent, it usually works.  If you want to be 100% sure you
still need to flush the cache explicitly.

It doesn't work when updates are made to data on disk and the ordering
matters (e.g. wrong ordering could corrupt data or cause it to be
lost).  In that case relying on the kernel to flush dirty buffers
periodically is not a feasible approach because you don't know when that
will happen and therefore have no control over ordering.

Stefan
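
To make the ordering point concrete, here is a sketch of the usual
write-to-a-temporary-file-then-rename pattern that applications use when
they cannot rely on periodic writeback.  The names below are invented for
illustration:

    import os

    def replace_file_atomically(path, new_bytes):
        # Update 'path' so a crash leaves either the old or the new
        # contents, never a half-written mix.  Illustrative sketch only.
        tmp_path = path + ".tmp"
        with open(tmp_path, "wb") as f:
            f.write(new_bytes)
            f.flush()
            os.fsync(f.fileno())   # 1. new data reaches stable storage first

        os.rename(tmp_path, path)  # 2. atomically switch the name over

        dirfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
        try:
            os.fsync(dirfd)        # 3. make the rename itself durable
        finally:
            os.close(dirfd)

The fsync before the rename is what enforces the ordering: without it the
kernel is free to write out the rename before the file data, and a crash
in between can leave the new name pointing at a zero-length or
partially written file.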