Re: corruption in USB harddrive - backup via send/receive - question

From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: "Miguel Negrão" <miguel.negrao-lists@friendlyvirus.org>,
	linux-btrfs@vger.kernel.org
Subject: Re: corruption in USB harddrive - backup via send/receive - question
Date: Fri, 17 Apr 2015 07:31:52 -0400
Message-ID: <5530EF28.4060209@gmail.com>
In-Reply-To: <loom.20150416T202016-456@post.gmane.org>

On 2015-04-16 14:48, Miguel Negrão wrote:
> Hello,
>
> I'm running a laptop, macbook pro 8,2, with ubuntu, on kernel
> 3.13.0-49-lowlatency. I have a USB enclosure containing two harddrives
> (Icydock JBOD). Each harddrive runs its own btrfs file system, on top of
> luks partitions. I back up one harddrive to the other using btrfs
> send/receive with incremental sends (tests that I did indicated this setup
> was too fragile for running btrfs RAID).
>
> I've noticed that files on one of the harddrives start to get corrupted
> sometimes. It's not many files, but it does happen from time to time. On
> IRC I was told it could be the USB enclosure, it could be memory, etc. The
> SMART data of the harddrives say they are fine, the quick SMART tests also
> pass without problems.
>
>
>   - Given that I'm running a laptop and communicating with the harddrives via
> USB, is it expected that I will get some corruption from time to time or is
> this abnormal and there is something very wrong with some of my equipment
> and if so how can I track what is responsible ?
>   - Is it possible to extract a file that has csum errors ? I work with audio
> files, if I don't have a backup of a file I would still like to get the full
> corrupted version, since most of the audio might still be perfectly fine.
> Can I tell btrfs to do a new csum of the file as it is now, and just live
> with the corruption ?
>
> I've copied a file to the main USB harddrive on 2015-02-21, the file was
> backed up to the other harddrive via send/receive on 2015-02-23. Now
> (yesterday) when I try to access the file on the main harddrive it is corrupted:
>
> Apr 16 19:20:35 miguel-MacBookPro kernel: [  835.944606] BTRFS info (device
> dm-1): csum failed ino 136726 off 1067679744 csum 4135207512 expected csum
> 1128560616
> Apr 16 19:20:35 miguel-MacBookPro kernel: [  835.948431] BTRFS info (device
> dm-1): csum failed ino 136726 off 1067761664 csum 730461863 expected csum
> 1924299628
> Apr 16 19:20:36 miguel-MacBookPro kernel: [  836.395372] BTRFS info (device
> dm-1): csum failed ino 136726 off 1067679744 csum 4135207512 expected csum
> 1128560616
> Apr 16 19:20:36 miguel-MacBookPro kernel: [  836.396682] BTRFS info (device
> dm-1): csum failed ino 136726 off 1067679744 csum 4135207512 expected csum
> 1128560616
>
> I can access it fine on the backup harddrive.
>
> Questions:
>
> - Can I assume that the corruption happened after the file was sent to
> the backup harddrive ?
> - Will btrfs send ever send a file with corrupted blocks ?
> - I kept running more backups, but that particular file was not changed
> since. Am I correct in assuming that since the file was not changed it was
> not sent again to the backup disk and that therefore the version I have in
> the backup should be a good copy ?
>
> Best regards,
> Miguel
>
> Label: 'huge-new'  uuid: 21d841c9-7c30-4d1b-b4c2-8c0e59e8959a
> 	Total devices 1 FS bytes used 1.04TiB
> 	devid    1 size 2.73TiB used 1.06TiB path /dev/mapper/huge-new
>
> [/dev/mapper/huge-new].write_io_errs   0
> [/dev/mapper/huge-new].read_io_errs    0
> [/dev/mapper/huge-new].flush_io_errs   0
> [/dev/mapper/huge-new].corruption_errs 1970
> [/dev/mapper/huge-new].generation_errs 0
>
> Btrfs v0.20-rc1-335-gf00dd83
>
> Label: 'huge-new-backup'  uuid: 9af299bc-48b0-4e52-8078-82749627d9f4
> 	Total devices 1 FS bytes used 1.04TiB
> 	devid    1 size 2.73TiB used 1.05TiB path /dev/mapper/huge-new-backup
>
> [/dev/mapper/huge-new-backup].write_io_errs   0
> [/dev/mapper/huge-new-backup].read_io_errs    0
> [/dev/mapper/huge-new-backup].flush_io_errs   0
> [/dev/mapper/huge-new-backup].corruption_errs 0
> [/dev/mapper/huge-new-backup].generation_errs 0
>
> Btrfs v0.20-rc1-335-gf00dd83
>

First, as mentioned in another reply to this, you should update your 
kernel.  I don't think that the kernel is what is causing the issue, but 
it is an old kernel by BTRFS standards, and keeping up to date is 
important with a filesystem under such heavy development.  The same 
actually goes for the userspace components as well, although that is 
less critical than the kernel side.

As to the corruption, this sounds like some kind of hardware issue to
me.  Assuming that you can afford to wipe the filesystems, I would
suggest running some tests on the disks with the program 'badblocks'
(found in e2fsprogs).  The fact that it is only the first disk that
is having issues would seem to indicate that either that port on the
enclosure is intermittently bad, or the disk itself is having issues.
The SMART tests passing just indicates that the disk doesn't think it is
failing, not that it is perfectly reliable (I've had disks that pass all
the SMART tests, and then just randomly reset themselves from time to
time).  I would also look into what manufacturer and firmware version
the drives are, as I do know that some of the early Seagate and WD
multi-terabyte drives had some serious firmware bugs that could cause
data corruption similar to this.
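
While you are narrowing down the hardware, it may help to see how the
csum failures are distributed, since the same offset is reported several
times in the log you quoted.  The following is only a minimal sketch in
Python, assuming the kernel messages look exactly like the ones in your
mail (other kernel versions may word them differently).  The function
name and the regular expression are my own and not part of any btrfs
tool.

```python
import re
from collections import defaultdict

# Pattern modelled on the "csum failed" lines quoted in this thread.
# Assumption: other kernel versions may format the message differently.
CSUM_RE = re.compile(
    r"csum failed ino (?P<ino>\d+) off (?P<off>\d+) "
    r"csum (?P<found>\d+) expected csum (?P<expected>\d+)"
)


def summarize_csum_failures(log_text):
    """Map each inode number to the sorted list of distinct failing offsets."""
    # Collapse all whitespace so that messages wrapped over several
    # lines are still matched as one message.
    flat = " ".join(log_text.split())
    offsets = defaultdict(set)
    for match in CSUM_RE.finditer(flat):
        offsets[int(match.group("ino"))].add(int(match.group("off")))
    return {ino: sorted(offs) for ino, offs in offsets.items()}
```

Feeding it the text of the kernel log (for example the output of dmesg)
gives one entry per affected inode.  An inode number can then be mapped
back to a path with 'btrfs inspect-internal inode-resolve', provided
your btrfs-progs is recent enough to have that subcommand.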