From: Douglas McClendon <dmc.fedora@filteredperception.org>
To: device-mapper development <dm-devel@redhat.com>
Subject: Re: possible snapshot corruption after system crash with bad timing
Date: Mon, 14 Jan 2008 20:29:58 -0600 [thread overview]
Message-ID: <478C1AA6.2000500@filteredperception.org> (raw)
In-Reply-To: <20080105224105.GA1122@debian-etc-mailname>
Lars Ellenberg wrote:
> hi there,
>
> I suspect that when using persistent snapshots,
> there is the (off?) chance for the snapshot to be corrupted
> after a system crash.
I hope someone can answer this, as I use persistent snapshots heavily
for LiveUSB persistence (rootfs on a persistent snapshot (on usbflash))
Currently things seem to be working fairly well for me, wheras about 6
months ago I would see corruption if I intentionally crashed the system
to see how it would respond.
Still, this post worries me. I hope someone else answers it.
-dmc
>
> scenario described below.
>
> Am I missing somthing?
>
> Cheers,
> Lars
>
> from current dm-exception-store.c:
>
> static void persistent_commit(struct exception_store *store,
> struct dm_snap_exception *e,
> void (*callback) (void *, int success),
> void *callback_context)
> {
> int r;
> unsigned int i;
> struct pstore *ps = get_info(store);
> struct disk_exception de;
> struct commit_callback *cb;
>
> de.old_chunk = e->old_chunk;
> de.new_chunk = e->new_chunk;
> write_exception(ps, ps->current_committed++, &de);
>
> /*
> * Add the callback to the back of the array. This code
> * is the only place where the callback array is
> * manipulated, and we know that it will never be called
> * multiple times concurrently.
> */
> cb = ps->callbacks + ps->callback_count++;
> cb->callback = callback;
> cb->context = callback_context;
>
> /*
> * If there are no more exceptions in flight, or we have
> * filled this metadata area we commit the exceptions to
> * disk.
> */
> if (atomic_dec_and_test(&ps->pending_count) ||
> (ps->current_committed == ps->exceptions_per_area)) {
> r = area_io(ps, ps->current_area, WRITE);
> if (r)
> ps->valid = 0;
>
>
>
> ==== POSSIBLE CURRUPTION RIGHT HERE ========
>
> assume we crash right after we completely filled the current area,
> but before we had a chance to zero out the next area.
>
> further assume there has been old data garbage in the
> area mapped for this exception store.
>
> after reboot, the snapshot will be reassembled,
> but may not find the necessary zeros to terminate the
> read_exceptions loop in insert_exceptions,
> thus may dm_add_exception bogus mappings,
> aka data corruption.
>
> ===========================================
>
>
> /*
> * Have we completely filled the current area ?
> */
> if (ps->current_committed == ps->exceptions_per_area) {
> ps->current_committed = 0;
> r = zero_area(ps, ps->current_area + 1);
> if (r)
> ps->valid = 0;
> }
>
> for (i = 0; i < ps->callback_count; i++) {
> cb = ps->callbacks + i;
> cb->callback(cb->context, r == 0 ? 1 : 0);
> }
>
> ps->callback_count = 0;
> }
> }
>
prev parent reply other threads:[~2008-01-15 2:29 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-01-05 22:41 possible snapshot corruption after system crash with bad timing Lars Ellenberg
2008-01-15 2:29 ` Douglas McClendon [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=478C1AA6.2000500@filteredperception.org \
--to=dmc.fedora@filteredperception.org \
--cc=dm-devel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.