linux-btrfs.vger.kernel.org archive mirror
From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Francesco Turco <fturco@fastmail.fm>, linux-btrfs@vger.kernel.org
Subject: Re: Frequent btrfs corruption on a USB flash drive
Date: Thu, 7 Jul 2016 11:42:22 -0400
Message-ID: <d2a820b5-7c53-aef4-6e55-0ed6b1da44ec@gmail.com>
In-Reply-To: <eefaeb0f-9eb0-bd2c-3ce5-5eb28bb43c68@fastmail.fm>

On 2016-07-07 10:55, Francesco Turco wrote:
> On 2016-07-07 16:27, Austin S. Hemmelgarn wrote:
>> This seems odd, are you trying to access anything over NFS or some other
>> network filesystem protocol here?  If not, then I believe you've found a
>> bug, because I'm pretty certain we shouldn't be returning -ESTALE for
>> anything.
>
> No, I don't use NFS or any other network filesystem.
OK, I'm going to check the kernel code to see whether there's any other
case where we might return that.  I'm pretty certain there's nowhere
BTRFS should return -ESTALE though, which means you've either hit a bug
or have some hardware issue (given past experience, I think a bug is
more likely).
>
>> The question here is: Do you get any data corruption when using ext4?
>> Quite often when there's a hardware issue, you won't see _any_
>> indication of it other than corrupted files when using something like
>> ext4 or XFS, but it will show up almost immediately with BTRFS because
>> we validate checksums on almost everything.  There have been at least a
>> couple of times I've found disk issues while converting from ext4 to
>> BTRFS that I didn't know existed before, and then going back was able to
>> reliably reproduce them using other tools.
>>
>> Also, FWIW, badblocks is not necessarily a reliable test method for
>> flash drives; they often handle serialized reads like badblocks does
>> very well even when failing.
>
> I'm not sure. Commands don't fail explicitly when I use ext4, but I
> agree with you that I may get corruption silently nonetheless. Perhaps I
> should try to rule out a hardware problem by filling my USB flash drive
> with a large random file and then checking if its SHA-1 checksum
> corresponds to the original copy on the hard disk. But first I probably
> should back up the current Btrfs filesystem with the dd command. Can I
> proceed?
Yes, I would suggest backing up the filesystem first.  Once you've
finished creating the backup copy, though, be careful not to have both
copies of the filesystem visible to the system at the same time (a dd
copy carries the same filesystem UUID as the original), as there are
potential issues if both are visible while trying to mount the FS.

As far as checking the drive, I'd do essentially what you described,
with two additions:
1. Calculate the checksum of the data on the drive multiple times, and
make sure it matches each time as well as matching the original file.
If each calculation from the drive matches the others but not the
original file, then the issue is in the write path only.
2. Repeat the whole write-and-verify cycle multiple times so you can be
sure to cover _every_ block.  Most flash drives have a pool of spare
blocks that are used for wear leveling, and if the issue is in one of
those, this is the only way to find it.
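
The comparison logic in step 1 can be sketched in Python as below.  This
is only an illustration, not a tested diagnostic tool: the function names
(sha1_of, classify, check_copy) and the result labels are made up for
this example, and the paths passed in are whatever files you choose.  It
also assumes each pass really reads from the device; unless the drive is
unmounted and remounted (or the caches are dropped) between passes, the
kernel may serve repeat reads from the page cache instead.

```python
import hashlib


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify(original, readings):
    """Interpret repeated digests of the copy stored on the drive.

    The labels are hypothetical names for this sketch:
      "unstable reads": the drive returned different data on different passes
      "write path":     every pass agrees, but not with the original file
      "ok":             every pass matches the original file
    """
    if len(set(readings)) > 1:
        return "unstable reads"
    if readings[0] != original:
        return "write path"
    return "ok"


def check_copy(original_path, copy_path, passes=3):
    """Hash the original once and the copy `passes` times, then classify."""
    original = sha1_of(original_path)
    readings = [sha1_of(copy_path) for _ in range(passes)]
    return classify(original, readings)
```

Step 2 would then amount to writing a fresh random file and calling
check_copy() again, several times over.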

You might also try some testing with FIO or iozone; those tend to
exercise a wider variety of things than tools like badblocks or dd.
Also, since you'll have a backup copy of the FS, you might consider
running a destructive test with badblocks (it works a bit more reliably
on flash devices this way; just make sure to run it multiple times
too), both with and without the -B option.  -B affects how things are
buffered: if you see errors with it enabled but none without it, then
you probably have some bad RAM.
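
To illustrate what a destructive write-then-verify pass does, here is a
rough Python sketch.  It is not badblocks and does not reproduce its
buffering options; the function name pattern_pass, its parameters, and
the default block size are assumptions made for this example.  It
OVERWRITES whatever path it is given, so only point it at a scratch file
or a device you have already backed up.  As with any read-back test, the
reads can be satisfied from the page cache rather than the device.

```python
import os

# Four fill bytes, the same ones badblocks uses in its write-mode test.
PATTERNS = (0xAA, 0x55, 0xFF, 0x00)


def pattern_pass(path, total_blocks, block_size=4096, patterns=PATTERNS):
    """DESTRUCTIVE: fill `path` with each pattern, read it back, and
    return a sorted list of block indices that did not match.

    `path` must already exist (a scratch file or a block device).
    """
    bad = set()
    for value in patterns:
        expected = bytes([value]) * block_size
        # Write the pattern over every block, then force it out of the
        # userspace and kernel write buffers.
        with open(path, "r+b") as f:
            for _ in range(total_blocks):
                f.write(expected)
            f.flush()
            os.fsync(f.fileno())
        # Read every block back and note any that differ.
        with open(path, "rb") as f:
            for index in range(total_blocks):
                if f.read(block_size) != expected:
                    bad.add(index)
    return sorted(bad)
```

An empty list from several consecutive runs is weak evidence that the
blocks touched are fine; a non-empty list is worth following up with the
real tools.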
>
>> Just to clarify, you're using BTRFS on top of disk encryption (LUKS? Or
>> is it just raw encryption, or even something completely different?), on
>> a USB flash drive (not a USB to SATA adapter with an SSD or HDD in it),
>> correct?
>
> I'm using a btrfs filesystem on a GUID partition encrypted with LUKS.
> It's a Kingston USB flash drive connected directly to my desktop machine
> via USB. It's definitely not an SSD or an HDD, and I'm not using any
> adapter.
OK, that both simplifies things and makes them a bit more complicated.
If it had been an SSD or HDD connected through an adapter, the preferred
method of checking would be to pull it out and connect it directly to
the system to verify the drive.  However, since it's a regular flash
drive, if the drive is at fault, it will probably be significantly less
expensive to replace.

