From: "Michael Kjörling" <michael@kjorling.se>
To: dm-crypt@saout.de
Subject: Re: [dm-crypt] Help request
Date: Tue, 7 Jul 2020 20:58:14 +0000
Message-ID: <a18449a3-644e-4478-96dc-268ed9a5d70c@localhost>
In-Reply-To: <3vDyxLyWr_rNbnR7rY0pHQ1ax-duBWq_KHokpmZ9DtfRU5AAWUbnKZSwNrX2isMmPfs4DQk9DMBz4Y-o9QNsVjHOHvLguizYpfIV-4vKSxw=@protonmail.com>

On 6 Jul 2020 03:36 +0000, from lacedaemonius1@protonmail.com (lacedaemonius):
> [ 643.631782] print_req_error: critical target error, dev sdi, sector 11721044993
> [ 643.631789] Buffer I/O error on dev sdi, logical block 11721044993, async page read

Notice that the errors are occurring on the raw device, not through a
dm-* mapping. That sector address is just past the 6 TB (about 5.46
TiB) mark; does that sound reasonable given the drive size? (It would
if the physical drive is _more_ than 6 TB in size, and it might if the
drive is advertised as 6 TB.) Assuming that the problematic drive is
still detected as sdi, what are the contents of /sys/block/sdi/size?
(That should be _greater than_ 11721044993, since sectors are numbered
from 0; otherwise, some metadata somewhere has been corrupted.)
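
The arithmetic behind that check can be sketched as follows. This is only
an illustration: the function names are made up here, and it assumes that
both the sector number in the kernel log and the value in
/sys/block/<dev>/size are counted in 512-byte units.

```python
from typing import Tuple

SECTOR_BYTES = 512  # unit used by kernel log sector numbers and sysfs "size"


def sector_to_offset(sector: int) -> Tuple[float, float]:
    """Return the byte offset of a sector as (decimal TB, binary TiB)."""
    offset = sector * SECTOR_BYTES
    return offset / 10**12, offset / 2**40


def sector_in_range(sector: int, size_sectors: int) -> bool:
    """Sectors are numbered from 0, so a valid sector is below the size."""
    return 0 <= sector < size_sectors


def read_sysfs_size(dev: str) -> int:
    """Read the device size in 512-byte sectors, e.g. dev='sdi' (Linux only)."""
    with open(f"/sys/block/{dev}/size") as f:
        return int(f.read().strip())


if __name__ == "__main__":
    tb, tib = sector_to_offset(11721044993)
    print(f"{tb:.3f} TB, {tib:.3f} TiB")
```

On the affected machine, sector_in_range(11721044993, read_sysfs_size("sdi"))
would then say whether the failing sector lies within the size the kernel
reports for the drive.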

If you luksOpen the LUKS container and "file -Ls" the corresponding
file in /dev/mapper, then what is the output of that? It should
indicate an ext4 file system in your case.

If that too fails, then I would suggest a pass of ddrescue reading
from the raw backing device and writing to /dev/null. (If you do this,
make VERY VERY SURE that you get the order right!) That will tell you
whether the data on the drive itself can be read without errors. If
you have enough storage elsewhere to make a copy of the whole contents
of the drive, strongly consider writing it there instead of throwing
it away; it can't hurt, and it might help. If you do this, expect it
to take the better part of a day to complete. (6 TB at 100 MB/s is
16-17 hours; you haven't specified the drive size, and 100 MB/s is a
reasonable average for a 7200 rpm rotational drive.)
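
For reference, the time estimate above is a plain size-over-rate division.
A minimal sketch, assuming a 6 TB drive and a sustained 100 MB/s (both are
the guesses from the paragraph above, not measured values; the function
name is illustrative):

```python
def copy_hours(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Hours needed to read size_bytes at a constant rate_bytes_per_s."""
    return size_bytes / rate_bytes_per_s / 3600


if __name__ == "__main__":
    # 6 TB (decimal) at 100 MB/s (decimal)
    print(f"{copy_hours(6e12, 100e6):.1f} hours")  # prints "16.7 hours"
```

A drive with unreadable areas will take longer, since every failed read
adds retry and timeout delays on top of this.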

That you're seeing delays of several seconds for those reads, and
user-visible delays of more than that, suggests to me that it's not
just an out-of-bounds read command issued to the drive. Such a command
should return more or less immediately with something like sector not
found, which in turn would be propagated as an I/O error.

Is the LUKS container LUKS 1 or LUKS 2? Is the drive GPT partitioned,
or something else?
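
If cryptsetup luksDump (which prints a "Version:" line) cannot read the
header, the version can also be read directly from the first bytes of the
container. The sketch below relies on the on-disk format as I understand
it: both LUKS 1 and LUKS 2 start with the six magic bytes "LUKS" 0xBA 0xBE
followed by a big-endian 16-bit version number. The function names and the
example path are illustrative, and the code only reads from the device.

```python
import struct
from typing import Optional

LUKS_MAGIC = b"LUKS\xba\xbe"


def luks_version(header: bytes) -> Optional[int]:
    """Return the LUKS version from the first 8 header bytes, or None."""
    if len(header) < 8 or header[:6] != LUKS_MAGIC:
        return None  # no LUKS magic here: wrong device or damaged header
    (version,) = struct.unpack(">H", header[6:8])
    return version


def luks_version_of(path: str) -> Optional[int]:
    """Read the first 8 bytes of a device or image file (read-only)."""
    with open(path, "rb") as f:
        return luks_version(f.read(8))


if __name__ == "__main__":
    # Hypothetical path; use the partition that holds the container.
    print(luks_version_of("/dev/sdi1"))
```

A result of None from the partition that should hold the container would
mean the magic bytes at the start are missing, which is worth knowing before
attempting anything else.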

> I don't think it's a drive failure because it's only a few months
> old and I haven't got any SMART warnings, so that leaves software.

Unfortunately, drives can fail without reporting failures in SMART
data, and they can fail early. While the probability of either is
_lower_, it is non-zero.

An in-use drive failing certainly can cause issues for the running
system. A drive failing but not holding swap or a critical file system
_shouldn't_ cause the kernel to crash, but I wouldn't completely rule
out the possibility.

The fact that the LUKS container was not closed _should_ not cause any
issues after a reboot, because closing the container really just
removes bookkeeping information and cryptographic keys from kernel
memory; it doesn't affect on-disk data. An unclean shutdown isn't
ideal for ext4, but it's usually not catastrophic.

> Is it worth making any attempt at trying to recover the drive and if
> so is there any documentation that explains what to do? I don't have
> a backup of the LUKS header, if that's the problem.

Do you have a recent backup of the data on the drive, or does the
drive that is giving you problems hold the only copy? Is it data that
you care a lot about, or can it be easily restored from other sources?
(This basically boils down to: how important is it to rescue the data
in-place?)

-- 
Michael Kjörling • https://michael.kjorling.se • michael@kjorling.se
 “Remember when, on the Internet, nobody cared that you were a dog?”
