From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: How to repair a BTRFS block?
Date: Sat, 25 Apr 2015 00:42:12 +0000 (UTC)
Message-ID: <pan$680be$6c095edf$ea91e4f5$d6164a0f@cox.net>
In-Reply-To: 553A810F.6010907@gnieh.org

Martin Monperrus posted on Fri, 24 Apr 2015 19:44:47 +0200 as excerpted:

> Hi Duncan,
> 
>> The kernel log (dmesg, also logged to syslog/journald on most systems)
>> from during the scrub should capture more information on those errors.
> Thanks. The dmesg log indeed contains the file path (see below).
> 
> The error is in /home/martin/XXXXX. It is related to a low-level error
> ("failed command: READ DMA").
> 
> Beyond this corrupted file, is my disk dead?
> Can I repair the file system or re-create a new one on the same disk?

A direct answer is beyond my knowledge level, certainly without SMART 
status information, etc.  What I do know is that assuming the rest of the 
device is responding fine, most drives keep a number of reserved sectors 
available and will automatically substitute them in on a *write* to an 
affected dead sector.

So if the device in general appears to be working fine, and assuming the
SMART status still passes, I'd back up everything else on that partition,
unmount it, then do something like a badblocks destructive write (-w)
test on the partition.  That test overwrites every sector, so the
filesystem has to be recreated afterward.  If the test comes back clean,
I'd consider the device usable again.

Also note that if you run smartctl -A (attributes) on the device before
attempting anything else and check the raw value for ID 5 (reallocated
sector count), then check again after doing something like that badblocks
-w, you can see whether it actually reallocated any sectors.

Finally, note that while a one-off is possible, once a drive starts
reallocating sectors it often fails relatively quickly, as that can
indicate a failing media layer, and once it starts to go, it often
doesn't stop.  So once you see that value move from zero, do keep an eye
on it, and if you notice the value starting to climb, get the data off
that thing as soon as possible.
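
For anyone who would rather script that before/after comparison than
eyeball it, here is a rough Python sketch.  It only parses text in the
usual smartctl -A table layout and reads the raw value for ID 5.  The
two sample tables and the function names are made up for illustration;
they are not output from any real drive.

```python
import re
from typing import Optional

REALLOCATED_ID = "5"  # SMART attribute ID for the reallocated sector count


def reallocated_raw(smartctl_output: str) -> Optional[int]:
    """Return the raw value of attribute ID 5, or None if it is not listed."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have ten columns; RAW_VALUE is the last one.
        if len(fields) >= 10 and fields[0] == REALLOCATED_ID:
            match = re.match(r"\d+", fields[9])
            return int(match.group()) if match else None
    return None


def reallocated_delta(before: str, after: str) -> int:
    """Number of sectors reallocated between two smartctl -A snapshots."""
    old, new = reallocated_raw(before), reallocated_raw(after)
    if old is None or new is None:
        raise ValueError("attribute ID 5 not found in one of the snapshots")
    return new - old


# Hypothetical snapshots, invented for this example only.
SAMPLE_BEFORE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       12
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
"""

SAMPLE_AFTER = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       12
  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       8
"""

if __name__ == "__main__":
    delta = reallocated_delta(SAMPLE_BEFORE, SAMPLE_AFTER)
    print(f"sectors reallocated between snapshots: {delta}")
```

In real use the two strings would be saved output from smartctl -A on
the actual device, taken before and after the write test.  Some drives
report the raw value in a different format, so treat the parsing as an
assumption to check against your own output.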

And of course it should go without saying, but I'll repeat the sysadmin's 
data value rule of thumb anyway, for the benefit of others reading as 
well.  If you care about the data, by definition, you have a (tested) 
backup (a corollary rule states that an untested backup isn't a backup at 
all).  If you don't have a backup, by definition you do NOT care about 
that data, /regardless/ of any claims to the contrary.  Unfortunately, 
many (most?) people end up learning this the hard way, finding out too 
late how much more value the data had than they thought, and thus that 
they /should/ have cared about it more (more backups, more testing of 
them) than they did.

(For those who end up in that situation...)  On the flip side there's the 
big picture.  During Hurricane Katrina, a data hosting firm in New Orleans
made (tech) headlines by blogging live their struggle to stay powered and 
online.  I was one of thousands watching that, along with the mainstream 
news about the flooding, looting and dying going on.  Obviously losing a 
bit of data ends up pretty far down the list when you're wet and cold and 
just lost your house and possibly members of your family!  A bit of data 
loss might hurt a bit, but in the big picture, if you're still healthy, 
and have a job and a home and family, it's /not/ the end of the world.  A 
bit of perspective helps! =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


