From: Stefan Hajnoczi <stefanha@gmail.com>
To: Andrew Martin <amartin@xes-inc.com>
Cc: qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] Using cache=writeback safely on qemu 1.4.0 and later
Date: Thu, 28 Aug 2014 11:22:09 +0100
Message-ID: <20140828102209.GC26741@stefanha-thinkpad.redhat.com>
In-Reply-To: <280510069.69184.1408990389981.JavaMail.zimbra@xes-inc.com>
On Mon, Aug 25, 2014 at 01:13:09PM -0500, Andrew Martin wrote:
> > >> > I recently experienced UPS failure on several hosts which caused a hard
> > >> > shutdown. After restarting, 3 of the guests had corruption on their disks
> > >> > and required a fairly long fsck to fix. Afterwards, data that had been
> > >> > written to the disks several hours before the crash was corrupted, which
> > >> > makes me think that it was never fsync()-ed to the non-volatile storage.
> > >>
> > >> What exactly was the "corruption" you encountered? Which application,
> > >> error message, etc.
> > >
> > > Two of the servers are web servers with apache2. In one case, a python
> > > daemon copies JPGs onto the server - the last 100 copied onto the server
> > > were corrupted. In another case, some files had been uploaded several days
> > > prior to the www-root, but after the hard reset said files were no longer
> > > present in the filesystem.
> >
> > Did the Python daemon fsync the files and directories it modified/created?
> >
> > Did you sync(1) after copying files to www-root?
> >
> > Also, you didn't explain what "corrupted" means. Were the jpg files
> > missing, were they zero bytes in size, were they filled with junk,
> > etc?
> >
> The jpgs appeared to be a normal size, but were filled with junk. The files
> uploaded by apache2 were missing from the filesystem.
>
> Even if the python daemon or apache2 did not fsync the modified files, isn't
> there some action that the OS takes periodically to flush dirty pages to disk?
> This seems to be implied in the SuSE documentation:
> https://www.suse.com/documentation/sles11/book_kvm/data/sect1_1_chapter_book_kvm.html
> "the normal page cache management will handle commitment to the storage device."
>
>
> In the case of the files uploaded by apache2, they were added to the server days
> before the power outage, so it seems like there would have been ample time for
> those changes to have been flushed.
In the general case of copying/creating some files and hoping that they
will be persistent, relying on periodic writeback usually works. If you
want to be 100% sure, you still need to flush the cache explicitly
(fsync(2) on the files and directories, or sync(1)).
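As a rough illustration of flushing explicitly, here is a minimal Python
sketch of a copy that does not return until the data and the new directory
entry have been handed to storage. The function name copy_durably is made up
for this example, and the directory fsync assumes a POSIX system such as
Linux:

```python
import os
import shutil


def copy_durably(src, dst):
    """Copy src to dst, then flush the file data and its directory entry.

    Hypothetical helper for illustration; assumes a POSIX system.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()             # push Python's userspace buffer to the kernel
        os.fsync(fout.fileno())  # ask the kernel to write the data to storage

    # fsync the parent directory so the new directory entry is persistent too
    dir_fd = os.open(os.path.dirname(os.path.abspath(dst)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A daemon that copies JPGs could call copy_durably(src, dst) in place of a
plain copy, at the cost of one flush per file.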
It doesn't work when updates are made to data on disk and the ordering
matters (e.g. wrong ordering could corrupt data or cause it to be lost).
In that case, relying on the kernel to flush dirty buffers periodically
is not a feasible approach because you don't know when the flush will
happen and therefore have no control over ordering.
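To make the ordering point concrete, here is a minimal Python sketch of one
common way an application can enforce ordering itself when replacing a file:
write the new contents to a temporary file, flush it, and only then rename it
over the old one. The name replace_atomically and the ".tmp" suffix are
assumptions for this example, and it assumes a POSIX filesystem where rename
within a directory is atomic:

```python
import os


def replace_atomically(path, data):
    """Replace the contents of path with data, controlling write ordering.

    Hypothetical helper for illustration; assumes a POSIX filesystem.
    """
    tmp = path + ".tmp"  # hypothetical temporary name in the same directory
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # step 1: new contents reach storage first

    os.replace(tmp, path)     # step 2: only then switch the name over

    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)      # step 3: make the rename itself persistent
    finally:
        os.close(dir_fd)
```

Without the fsync in step 1, the kernel would be free to write out the rename
before the file data, and a power loss in between could leave a file of the
right name filled with junk.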
Stefan