From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: Duncan <1i5t5.duncan@cox.net>, Nils Steinger <nst@voidptr.de>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
Date: Tue, 24 Nov 2015 07:46:19 -0500
Message-ID: <56545C1B.1080507@gmail.com>
In-Reply-To: <pan$7e9cd$79273243$6e85efe7$7ee378ef@cox.net>

On 2015-11-24 00:42, Duncan wrote:
> Nils Steinger posted on Mon, 23 Nov 2015 22:10:12 +0100 as excerpted:
>
>> Do we know anything about what might cause a filesystem to enter a state
>> which `send` chokes on?
>> I've only seen a small sample of the corrupted files before growing
>> tired of the process and just recreating the whole thing, but all of
>> them were database files (presumably SQLite). Could it be that the files
>> were being written to during an unclean shutdown, leading to some kind
>> of corruption of the FS? Unfortunately, I was a little trigger-happy when
>> cleaning up old snapshots, so there aren't any left to aid in
>> troubleshooting this problem further…
That's OK, I haven't been able to figure out much either.  About a month
ago I had a case where roughly 200 different files hit the issue (I
wrote a script at the time to automate fixing it, but I can't find it
now), and I've had other cases on my systems over the past year (I only
started using send for backups about a year ago).  It might be worth
noting that you're the first person to report this directly (I would
have, but I hate to report anything that isn't a critical data safety
issue without a reliable reproducer).
>
> Austin's the one attempting to trace down the problem, so he'd have the
> most direct answer there.  (My use-case doesn't involve snapshotting or
> send/receive at all.)
About a month ago I stopped using send/receive for backups, after
hitting this for what I think was the seventh time in the past year.  I
still use snapshots for backups, but now I use them to generate SquashFS
images, since I really don't care about the block layout, inode numbers,
or most of the BTRFS-related properties.  That still gives me bootable
backups, and it also saves significant storage space both locally and on
the cloud storage services I use for off-site backups (which in turn
saves money).  I am still trying to pull together something that
reliably reproduces this though, as I still use send/receive for some
things (like cloning VMs without taking them offline or hitting the
issues with block copies of a BTRFS filesystem).
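
A minimal sketch of the snapshot-to-SquashFS approach described above,
not the actual tooling used here.  It assumes the standard
`btrfs subvolume snapshot -r`, `mksquashfs`, and `btrfs subvolume delete`
commands; the function names, the paths in the usage note, and the
choice to delete the snapshot after packing are all hypothetical.

```python
import subprocess
from pathlib import Path


def squashfs_backup_commands(subvol, snap_dir, image_dir, name):
    """Return the command lines for one snapshot-to-SquashFS backup run."""
    snapshot = Path(snap_dir) / name
    image = Path(image_dir) / f"{name}.squashfs"
    return [
        # A read-only snapshot gives mksquashfs a frozen view of the subvolume.
        ["btrfs", "subvolume", "snapshot", "-r", str(subvol), str(snapshot)],
        # Pack the file tree only; block layout and inode numbers are not kept.
        ["mksquashfs", str(snapshot), str(image), "-noappend"],
        # Assumption: the snapshot is no longer needed once the image exists.
        ["btrfs", "subvolume", "delete", str(snapshot)],
    ]


def run_backup(subvol, snap_dir, image_dir, name, dry_run=True):
    """Print the commands (dry run) or execute them, stopping on failure."""
    commands = squashfs_backup_commands(subvol, snap_dir, image_dir, name)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_backup("/", "/snapshots", "/backups", "root")` with the
default `dry_run=True` only prints the three commands, which makes it
easy to check the paths before running anything as root.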
>
> But if any type of files would be likely to create issues, it'd be
> something like database or VM image files, since the random-file-rewrite-
> pattern they typically have is in general the most problematic for copy-
> on-write (COW) filesystems such as btrfs.  Without some sort of
> additional fragmentation management (like the autodefrag mount option),
> these files will end up _highly_ fragmented on btrfs, often thousands of
> fragments, tens of thousands when the files in question are multi-gig.
In general, I've seen this mostly with three types of files:
1. Database files and VM images.  In my experience, these account for
the majority of cases on filesystems that have them.  Autodefrag doesn't
seem to help, at least not for SQLite or Berkeley DB/GDBM databases.
2. Shared libraries and executables.  These account for the majority of
cases on filesystems without databases or VM images, although I can't
for the life of me figure out why, as they are usually written to very
infrequently.
3. Plain text configuration files.

For example, the last time this happened, it was on the root filesystem
of one of my systems.  About a third of the problem files were either in
/etc or text files under /usr/share, while the remaining two thirds were
mostly under /usr/lib and /lib.  It's probably also worth noting that
I've never seen this triggered by certain files that I would expect to
trigger it based on the above, in particular:
1. ClamAV virus databases (IIRC, these are similar in structure to
SQLite DBs).
2. BOINC applications.
3. Almost anything in /usr/libexec (stuff like GCC and binutils).
4. Almost any kind of script.
It's probably also worth noting that I occasionally see inconsistencies
in database files that cause this to happen, but I have never seen any
corruption in any other type of file, so it doesn't seem to have an
impact on data safety.
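
For anyone collecting their own list of files that send chokes on, a
breakdown like the one above can be tallied with a short script.  This
is only a sketch: the group labels, the prefix table, and the sample
paths in the usage note are assumptions for illustration, not data from
the affected filesystem.

```python
from collections import Counter

# Hypothetical grouping that mirrors the locations mentioned above.
GROUPS = {
    "config/text": ("/etc/", "/usr/share/"),
    "libraries": ("/usr/lib/", "/lib/"),
}


def tally_by_location(paths):
    """Count problem files per location group; unmatched paths go to 'other'."""
    counts = Counter()
    for path in paths:
        for label, prefixes in GROUPS.items():
            # str.startswith accepts a tuple of prefixes.
            if path.startswith(prefixes):
                counts[label] += 1
                break
        else:
            counts["other"] += 1
    return counts
```

Feeding it made-up paths such as `/etc/fstab`, `/usr/lib/libfoo.so`, and
`/home/user/data.sqlite` yields one count each for `config/text`,
`libraries`, and `other`.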

