From: Josef Bacik <jbacik@fusionio.com>
To: John Williams <jwilliams4200@gmail.com>
Cc: <linux-btrfs@vger.kernel.org>
Subject: Re: Why does btrfs benchmark so badly in this case?
Date: Thu, 8 Aug 2013 16:38:55 -0400
Message-ID: <20130808203855.GI16712@localhost.localdomain>
In-Reply-To: <CAJBj3vfrHPDhM579JZDhUQgmyd2PnUtrgwYX0718tOrK8MzNCA@mail.gmail.com>

On Thu, Aug 08, 2013 at 01:23:22PM -0700, John Williams wrote:
> On Thu, Aug 8, 2013 at 12:40 PM, Josef Bacik <jbacik@fusionio.com> wrote:
> > On Thu, Aug 08, 2013 at 09:13:04AM -0700, John Williams wrote:
> >> Phoronix periodically runs benchmarks on filesystems, and one thing I
> >> have noticed is that btrfs always does terribly on their fio "Intel
> >> IOMeter fileserver access pattern" benchmark:
> >>
> >> http://www.phoronix.com/scan.php?page=article&item=linux_310_10fs&num=2
> 
> > So the reason this workload sucks for btrfs is because we fall back on buffered
> > IO because fio does not do block size aligned writes for this workload.  If you
> > add
> >
> > ba=4k
> >
> > to the iometer fio file then we go the same speed as xfs and ext4.  Not a whole
> > lot we can do about this since unaligned writes means we have to read in pages
> > to cow the block properly, which is why we fall back to buffered.  Once we do
> > that we end up having a lot of page locking stuff that gets in the way and makes
> > us twice as slow.  Thanks,
> 
> Thanks for looking into it.
> 
> So I guess the reason that ZFS does well with that workload is that
> ZFS is using smaller blocks, maybe just 512B ?
> 

Yeah, I'm not sure what ZFS does, but if you are writing over a block and the
size/offset isn't aligned, then you'd see similar issues with ZFS, since it would
have to read+modify+write.  It is likely that ZFS is just using a smaller
blocksize.
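
As a rough illustration of the alignment point (a sketch only, not btrfs or
ZFS code; the function name and the 4096-byte default are assumptions made for
the example), the following works out which blocks a write covers only
partially. Those are the blocks that would need their old contents read in
before the new data can be written out:

```python
def partial_blocks(offset, length, block_size=4096):
    """Return indices of blocks that a write covers only partially.

    A partially covered block needs read+modify+write: the existing
    contents must be read so the untouched bytes can be preserved.
    Hypothetical helper for illustration, not taken from any filesystem.
    """
    if length <= 0:
        return []
    end = offset + length              # exclusive end of the write
    first = offset // block_size       # block holding the first byte
    last = (end - 1) // block_size     # block holding the last byte
    partial = []
    if offset % block_size != 0:       # write starts mid-block
        partial.append(first)
    if end % block_size != 0 and last not in partial:  # write ends mid-block
        partial.append(last)
    return partial


# A 4k write at a 512-byte offset straddles two 4k blocks, both partial.
print(partial_blocks(512, 4096))
# The same write with 512-byte blocks is fully aligned.
print(partial_blocks(512, 4096, block_size=512))
```

With a 4k block size the unaligned write touches two partial blocks, while the
same write against 512-byte blocks touches none, which is consistent with the
guess above about a smaller blocksize.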

> I wonder how common these type of non-4K aligned workloads are.
> Apparently, people with such workloads should avoid btrfs, but maybe
> these types of workloads are very rare?

Most people who use AIO/O_DIRECT have really specific setups that can generally
adjust how they align their IO.  Databases, for example, would align to the db
page size, and those are usually large, like 16k-32k.  Virtual images will
hopefully be doing block-aligned IO, but this depends on the host OS.  Like I
said, there isn't a whole lot we can do about this: you can use NOCOW if you
want to get around it without changing your application, or you can change the
app to be blocksize aligned.  Thanks,

Josef
