linux-btrfs.vger.kernel.org archive mirror
From: Josef Bacik <jbacik@fb.com>
To: Lennart Poettering <lennart@poettering.net>,
	Russell Coker <russell@coker.com.au>
Cc: <kreijack@inwind.it>, Duncan <1i5t5.duncan@cox.net>,
	<linux-btrfs@vger.kernel.org>,
	<systemd-devel@lists.freedesktop.org>
Subject: Re: [systemd-devel] Slow startup of systemd-journal on BTRFS
Date: Mon, 16 Jun 2014 09:05:48 -0700
Message-ID: <539F15DC.4010600@fb.com>
In-Reply-To: <20140616101448.GB18016@tango.0pointer.de>



On 06/16/2014 03:14 AM, Lennart Poettering wrote:
> On Mon, 16.06.14 10:17, Russell Coker (russell@coker.com.au) wrote:
>
>>> I am not really following though why this trips up btrfs though. I am
>>> not sure I understand why this breaks btrfs COW behaviour. I mean,
>>> fallocate() isn't necessarily supposed to write anything really, it's
>>> mostly about allocating disk space in advance. I would claim that
>>> journald's usage of it is very much within the entire reason why it
>>> exists...
>>
>> I don't believe that fallocate() makes any difference to fragmentation on
>> BTRFS.  Blocks will be allocated when writes occur so regardless of an
>> fallocate() call the usage pattern in systemd-journald will cause
>> fragmentation.
>
> journald's write pattern looks something like this: append something to
> the end, make sure it is written, then update a few offsets stored at
> the beginning of the file to point to the newly appended data. This is
> of course not easy to handle for COW file systems. But then again, it's
> probably not too different from access patterns of other database or
> database-like engines...

I was waiting for you to show up before I said anything, since most
systemd-related emails devolve into how evil you are rather than what
is actually happening.

So you are doing all the right things from what I can tell; I'm just a
little confused about when you guys run fsync.  From what I can tell
it's only when you open the journal file and when you switch it to
"offline."  I didn't look too much past this point so I don't know how
often these things happen.  Are you taking an individual message,
writing it, updating the head of the file and then fsyncing?  Or are
you accumulating a good bit of dirty log data and fsyncing occasionally?
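
To make the two alternatives concrete, here is a rough Python sketch of
the difference.  It is not journald's code: the 8-byte header, the
append_entries() helper and the fsync_every parameter are all
hypothetical, and the ordering is simplified (the entry and the header
update are flushed by the same fsync).  fsync_every=1 is the
"fsync per message" case; a larger value is the "fsync occasionally"
case.

```python
import os
import struct

HEADER_SIZE = 8  # hypothetical header: a single little-endian u64 tail offset


def append_entries(path, entries, fsync_every):
    """Append each entry, update the tail offset stored at the start of
    the file, and fsync once per `fsync_every` entries.

    Returns the number of fsync() calls issued.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    fsyncs = 0
    dirty = False
    try:
        tail = HEADER_SIZE
        for i, entry in enumerate(entries, start=1):
            # Append the new data at the current end of the log.
            os.pwrite(fd, entry, tail)
            tail += len(entry)
            # Rewrite the offset at the head of the file so it points
            # past the newly appended data.
            os.pwrite(fd, struct.pack("<Q", tail), 0)
            dirty = True
            if i % fsync_every == 0:
                os.fsync(fd)
                fsyncs += 1
                dirty = False
        if dirty:
            # Flush whatever is left over from a partial batch.
            os.fsync(fd)
            fsyncs += 1
    finally:
        os.close(fd)
    return fsyncs
```

With ten entries, fsync_every=1 issues ten fsyncs and rewrites the head
block before each one, while fsync_every=5 issues only two.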

What would cause btrfs problems is if you fallocate(), write a small
chunk, fsync, write a small chunk again, fsync again, etc.  Fallocate
saves you the first time around, but if the next write is within the
same block as the previous write we'll end up triggering cow and enter
fragmented territory.  If this is what journald is doing then that
would be good to know; if not, I'd like to know what is happening,
since we shouldn't be fragmenting this badly.
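
The problematic pattern can be sketched roughly as follows.  This is
only an illustration under stated assumptions: the 4096-byte block
size, the small_write_fsync_loop() name and its parameters are
hypothetical, and the function does not measure fragmentation itself.
It just performs the fallocate/write/fsync loop and counts how many
writes touch a block that an earlier, already-synced write touched,
which are the writes that would be expected to trigger cow.

```python
import os

BLOCK_SIZE = 4096  # assumed filesystem block size


def small_write_fsync_loop(path, prealloc, chunk, count):
    """fallocate `prealloc` bytes, then do `count` sequential writes of
    `chunk` bytes, fsyncing after each one.

    Returns how many writes touched a block that a previous, already
    fsynced write had touched.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    rewritten = 0
    synced_blocks = set()
    try:
        # Reserve the space up front, as journald does with fallocate().
        os.posix_fallocate(fd, 0, prealloc)
        offset = 0
        for _ in range(count):
            first = offset // BLOCK_SIZE
            last = (offset + chunk - 1) // BLOCK_SIZE
            blocks = set(range(first, last + 1))
            if blocks & synced_blocks:
                # This write modifies a block that is already on disk.
                rewritten += 1
            os.pwrite(fd, b"\0" * chunk, offset)
            os.fsync(fd)
            synced_blocks |= blocks
            offset += chunk
    finally:
        os.close(fd)
    return rewritten
```

For example, sixteen 512-byte writes cover two blocks, so all but the
first write into each block (14 of 16) hit an already-synced block,
whereas block-sized, block-aligned writes never do.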

Like I said, what you guys are doing is fine; if btrfs falls on its face
then it's not your fault.  I'd just like an exact idea of when you guys
are fsyncing so I can replicate it in a smaller way.  Thanks,

Josef
