From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: Larkin Lowrey <llowrey@nuclearwinter.com>,
Duncan <1i5t5.duncan@cox.net>,
linux-btrfs@vger.kernel.org
Subject: Re: Heavy nocow'd VM image fragmentation
Date: Mon, 27 Oct 2014 08:04:56 -0400
Message-ID: <544E34E8.2060103@gmail.com>
In-Reply-To: <544D2D6D.6050301@nuclearwinter.com>
On 2014-10-26 13:20, Larkin Lowrey wrote:
> On 10/24/2014 10:28 PM, Duncan wrote:
>> Robert White posted on Fri, 24 Oct 2014 19:41:32 -0700 as excerpted:
>>
>>> On 10/24/2014 04:49 AM, Marc MERLIN wrote:
>>>> On Thu, Oct 23, 2014 at 06:04:43PM -0500, Larkin Lowrey wrote:
>>>>> I have a 240GB VirtualBox vdi image that is showing heavy
>>>>> fragmentation (filefrag). The file was created in a dir that was
>>>>> chattr +C'd, the file was created via fallocate and the contents of
>>>>> the orignal image were copied into the file via dd. I verified that
>>>>> the image was +C.
>>>> To be honest, I have the same problem, and it's vexing:
>>> If I understand correctly, when you take a snapshot the file goes into
>>> what I call "1COW" mode.
>> Yes, but the OP said he hadn't snapshotted since creating the file, and
>> MM's a regular that actually wrote much of the wiki documentation on
>> raid56 modes, so he better know about the snapshotting problem too.
>>
>> So that can't be it. There's apparently a bug in some recent code, and
>> it's not honoring the NOCOW even in normal operation, when it should be.
>>
>> (FWIW I'm not running any VMs or large DBs here, so don't have nocow set
>> on anything and can and do use autodefrag on all my btrfs. So I can't
>> say one way or the other, personally.)
>>
>
> Correct, there were no snapshots during VM usage when the fragmentation
> occurred.
>
> One unusual property of my setup is I have my fs on top of bcache. More
> specifically, the stack is md raid6 -> bcache -> lvm -> btrfs. When the
> fs mounts it has mount option 'ssd' due to the fact that bcache sets
> /sys/block/bcache0/queue/rotational to 0.
>
> Is there any reason why either the 'ssd' mount option or being backed by
> bcache could be responsible?
>
Two things:

First, regarding your question: the ssd mount option "shouldn't" be
responsible for this, because it is supposed to spread out allocation
only at the chunk level, not the block level, but some recent commit may
have changed that.

Are you using any kind of compression in btrfs? If so, filefrag won't
report the number of fragments correctly (it currently reports the
number of compressed blocks in the file instead). In fact, with
compression enabled I would expect the number of compressed blocks to
go up as you use more space in the VM image: long runs of zero bytes
compress well, while other data (especially on-disk structures from
encapsulated filesystems) doesn't.

You might also consider putting the VM images directly on the LVM layer
instead; in my experience that tends to perform much better than
storing them on a filesystem.
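
As a rough way to check the compression point above, here is a minimal
Python sketch that summarizes `filefrag -v` output. It assumes the usual
column layout of `filefrag -v` and that compressed extents carry the
"encoded" flag; the names summarize_filefrag and run_filefrag are made
up for this example, so treat it as an illustration rather than a
tested tool.

```python
import re
import subprocess

# One extent row of `filefrag -v` (assumed layout), e.g.
#    0:        0..      31:      34816..     34847:     32:             encoded
EXTENT_ROW = re.compile(
    r"^\s*\d+:\s+\d+\.\.\s*\d+:\s+\d+\.\.\s*\d+:\s+\d+:(.*)$"
)


def summarize_filefrag(text):
    """Count extent rows and how many of them carry the 'encoded' flag."""
    total = 0
    encoded = 0
    for line in text.splitlines():
        match = EXTENT_ROW.match(line)
        if match is None:
            continue  # header, summary, or a line in an unexpected format
        total += 1
        if "encoded" in match.group(1):
            encoded += 1
    return {"extents": total, "encoded": encoded}


def run_filefrag(path):
    """Run `filefrag -v` on path and summarize its output (may need root)."""
    result = subprocess.run(
        ["filefrag", "-v", path],
        capture_output=True, text=True, check=True,
    )
    return summarize_filefrag(result.stdout)
```

If most extents come back as "encoded", the fragment count is probably
reflecting compression rather than real fragmentation.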

Secondly, I'd recommend switching from using bcache under LVM to using
dm-cache on top of LVM, as it makes it much easier to recover from the
various failure modes and to deal with a corrupted cache, because
dm-cache doesn't put any metadata on the backing device. It takes longer
to shut down in write-back mode and isn't SSD-optimized, but it has also
been much more reliable in my experience.