From: Robert White <rwhite@pobox.com>
To: Chris Murphy <lists@colorremedies.com>,
	Zygo Blaxell <zblaxell@furryterror.org>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: filesystem corruption
Date: Tue, 04 Nov 2014 14:19:36 -0800
Message-ID: <545950F8.1050505@pobox.com>
In-Reply-To: <BCBD4D38-1631-418A-8F3B-16497BDEB300@colorremedies.com>

On 11/04/2014 10:28 AM, Chris Murphy wrote:
> On Nov 3, 2014, at 9:31 PM, Zygo Blaxell <zblaxell@furryterror.org> wrote:
>> Now we have two disks with equal generation numbers.  Generations 6..9
>> on sda are not the same as generations 6..9 on sdb, so if we mix the
>> two disks' metadata we get bad confusion.
>>
>> It needs to be more than a sequential number.  If one of the disks
>> disappears we need to record this fact on the surviving disks, and also
>> cope with _both_ disks claiming to be the "surviving" one.
>
> I agree this is also a problem. But the most common case is where we know that sda generation is newer (larger value) and most recently modified, and sdb has not since been modified but needs to be caught up. As far as I know the only way to do that on Btrfs right now is a full balance; it doesn't catch up just by being reconnected with a normal mount.


I would think that any time a filesystem, or any fraction of one, is
mounted with both "degraded" and "rw" status, a degraded flag should be
set somewhere/somehow in the superblock etc.

The only way to clear this flag would be to reach a "reconciled" state.
That state could be reached in one of several ways. Removing the missing
mirror element would be a fast reconcile; doing a balance or scrub would
be a slow reconcile for a filesystem where all the media are returned to
service (e.g. the missing volume of a RAID 1 is returned).
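The flag lifecycle described above can be sketched roughly as follows. This
is only an illustration in Python: the Superblock class, the degraded_dirty
field, and both function names are hypothetical and do not correspond to
anything in btrfs.

```python
from dataclasses import dataclass


@dataclass
class Superblock:
    # Hypothetical stand-in for superblock state; not a real btrfs structure.
    degraded_dirty: bool = False


def on_mount(sb: Superblock, degraded: bool, read_write: bool) -> None:
    # The flag is set only when the mount is both degraded and writable.
    # A later clean mount never clears it; only a reconcile does.
    if degraded and read_write:
        sb.degraded_dirty = True


def on_reconciled(sb: Superblock) -> None:
    # Called after a fast reconcile (missing device removed) or a slow one
    # (balance or scrub finished with all media back in service).
    sb.degraded_dirty = False
```

The point of the sketch is that the flag is sticky: a degraded read-only
mount leaves it alone, and a normal mount after a degraded,rw one still
sees it set.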

Generation numbers are pretty good, but I'd put on a rider that any
generation number or equivalent incremented while the system is degraded
should have a unique quantum (say a GUID) generated and stored along with
the generation number. The mere existence of this quantum would act as
the degraded flag.

Any check/compare/access related to the generation number would know to 
notice that the GUID is in place and do the necessary resolution. If 
successful the GUID would be discarded.
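To make the comparison concrete, here is a minimal sketch in Python. The
GenStamp type, the bump and compare functions, and the result strings are
all made up for illustration; the only assumption taken from the text is
that a GUID accompanies any generation number incremented while degraded.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GenStamp:
    # Hypothetical pairing of a generation number with an optional GUID.
    generation: int
    degraded_id: Optional[uuid.UUID] = None  # set only if bumped while degraded


def bump(stamp: GenStamp, degraded: bool) -> GenStamp:
    # Every increment made while degraded gets a fresh GUID.
    return GenStamp(stamp.generation + 1, uuid.uuid4() if degraded else None)


def compare(a: GenStamp, b: GenStamp) -> str:
    if a.degraded_id is None and b.degraded_id is None:
        # No degraded history on either side: plain numeric comparison.
        if a.generation == b.generation:
            return "same"
        return "a_newer" if a.generation > b.generation else "b_newer"
    if a == b:
        return "same"  # same number and same GUID: shared degraded history
    if a.degraded_id is not None and b.degraded_id is not None:
        return "diverged"  # both sides were written while degraded
    return "needs_catch_up"  # only one side was written while degraded
```

With this, two disks that each went from generation 5 to 9 in separate
degraded mounts carry equal numbers but different GUIDs, so compare()
reports "diverged" instead of treating them as identical.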

As to how this could be implemented, I'm not fully conversant on the 
internal layout.

One possibility would be to add a block reference or, indeed, to replace
the current storage for generation numbers completely with a block
reference to a block containing the generation number and the potential
GUID. The main value of having an out-of-structure reference is that its
content is less space constrained, and it could be shared by multiple
usages. In the case, for instance, where the block is added (as opposed
to replacing the generation number), only one such block would be needed
per degraded,rw mount, and it could be attached to as many filesystem
structures as needed.
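A rough sketch of the shared, out-of-structure block, again in Python and
again with invented names (the blocks table, Structure, and the address
4096 used in the usage note are placeholders, not btrfs layout):

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class GenBlock:
    generation: int
    degraded_id: Optional[uuid.UUID]


# Hypothetical block store: block address -> generation block.
blocks: Dict[int, GenBlock] = {}


def start_degraded_mount(addr: int, generation: int) -> int:
    # One block is written per degraded,rw mount; callers keep its address.
    blocks[addr] = GenBlock(generation, uuid.uuid4())
    return addr


@dataclass
class Structure:
    # Any filesystem structure that points at the shared block by address.
    name: str
    gen_block_addr: int


def is_degraded(s: Structure) -> bool:
    return blocks[s.gen_block_addr].degraded_id is not None
```

Calling start_degraded_mount(4096, 6) once and handing the returned address
to several Structure instances gives them all the same GUID, and discarding
the GUID in that one block after a successful resolution clears it for every
structure that references it.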


Just as metadata under DUP is divergent after a degraded mount, a
generation block would be divergent, and likely in a different location
than its peers on a subsequent restored geometry.

A generation block could have other niceties like the date/time and the
devices present (or absent); such information could conceivably be used
to intelligently disambiguate references. For instance, if one degraded
mount had sda and sdb, and a second had sdb and sdc, then it'd be known
that sdb was dominant for having been present every time.
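The sda/sdb/sdc example amounts to a set intersection over the devices
recorded in each generation block. A minimal sketch, assuming a
hypothetical GenBlock record with exactly the fields mentioned above:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet, List


@dataclass(frozen=True)
class GenBlock:
    # Hypothetical contents of a generation block; field names are invented.
    generation: int
    mounted_at: datetime
    devices_present: FrozenSet[str]


def dominant_devices(history: List[GenBlock]) -> FrozenSet[str]:
    # A device is dominant if it was present in every recorded degraded mount.
    if not history:
        return frozenset()
    return frozenset.intersection(*(b.devices_present for b in history))


history = [
    GenBlock(6, datetime(2014, 11, 3, tzinfo=timezone.utc), frozenset({"sda", "sdb"})),
    GenBlock(7, datetime(2014, 11, 4, tzinfo=timezone.utc), frozenset({"sdb", "sdc"})),
]
print(sorted(dominant_devices(history)))  # ['sdb']
```

An empty result would mean no device was present every time, in which case
this information alone cannot pick a winner.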

