public inbox for linux-btrfs@vger.kernel.org
From: Hugo Mills <hugo@carfax.org.uk>
To: Ingvar Bogdahn <ingvar.bogdahn@googlemail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: CoW with webserver databases: innodb_file_per_table and dedicated tables for blobs?
Date: Mon, 15 Jun 2015 09:57:20 +0000
Message-ID: <20150615095720.GF9850@carfax.org.uk>
In-Reply-To: <557E9C2B.9030404@gmail.com>

On Mon, Jun 15, 2015 at 11:34:35AM +0200, Ingvar Bogdahn wrote:
> Hello there,
> 
> I'm planning to use btrfs for a medium-sized webserver. It is
> commonly recommended to set nodatacow for database files to avoid
> performance degradation. However, apparently nodatacow disables some
> of my main motivations for using btrfs: checksumming and (probably)
> incremental backups with send/receive (please correct me if I'm
> wrong on this). Also, the databases are among the most important
> data on my webserver, so it is particularly there that I would like
> those features working.
> 
> My question is, are there strategies to avoid nodatacow of databases
> that are suitable and safe in a production server?
> I thought about the following:
> - in mysql/mariadb: setting "innodb_file_per_table" should avoid
> having few very big database files.

   It's not so much about the overall size of the files, but about the
write patterns, so this probably won't be useful.

> - in mysql/mariadb: adapting database schema to store blobs into
> dedicated tables.

   Probably not an issue -- each BLOB is likely to be written in a
single unit, which won't cause the fragmentation problems.

> - btrfs: set autodefrag or some cron job to regularly defrag only
> database files to avoid performance degradation due to fragmentation

   Autodefrag is a good idea, and I would suggest trying that first,
before anything else, to see if it gives you good enough performance
over time.

   Running an explicit defrag will break any CoW copies you have (like
snapshots), causing them to take up additional space. For example,
start with a 10 GB subvolume. Snapshot it, and you will still only
have 10 GB of disk usage. Defrag one (or both) copies, and you'll
suddenly be using 20 GB.
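
   The arithmetic in that example can be sketched with a toy model in
Python. This is an illustration only, not how btrfs accounts for space
internally: the names disk_usage, snapshot and defrag are hypothetical,
a file is modelled as a list of (extent id, size in GB) pairs, and
defrag is simplified to "rewrite everything into one new extent".

```python
import itertools

# Hypothetical ids for newly allocated extents (assumption for this model).
_new_extent_ids = itertools.count(1000)

def disk_usage(*files):
    """Total size in GB of the distinct extents referenced by the files."""
    unique = {}
    for f in files:
        for extent_id, size in f:
            unique[extent_id] = size  # a shared extent is counted once
    return sum(unique.values())

def snapshot(f):
    """A snapshot references the same extents as the original."""
    return list(f)

def defrag(f):
    """Simplified defrag: copy all data into one freshly allocated extent."""
    total = sum(size for _, size in f)
    return [(next(_new_extent_ids), total)]

subvol = [(i, 1) for i in range(10)]     # ten 1 GB extents = 10 GB
snap = snapshot(subvol)
usage_before = disk_usage(subvol, snap)  # extents shared, counted once

subvol = defrag(subvol)                  # subvol no longer shares with snap
usage_after = disk_usage(subvol, snap)

print(usage_before, usage_after)
```

   In this model the snapshot keeps the old extents alive while the
defragmented copy points at new ones, which is where the extra space
goes.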

> - turn on compression on either btrfs or mariadb

   Again, won't help. The issue is not the size of the data, it's the
write patterns: small random writes into the middle of existing files
will eventually cause those files to fragment, which causes lots of
seeks and short reads, which degrades performance.
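
   A rough way to see the effect of the write pattern is a small
simulation. This is a sketch under simplifying assumptions, not a
model of the real btrfs allocator: the file is a map from logical
block to physical block, every CoW write places the new copy at the
next free physical block, and simulate and count_fragments are
made-up names.

```python
import random

def count_fragments(physical):
    """Number of physically contiguous runs in a logical-to-physical map."""
    return 1 + sum(1 for a, b in zip(physical, physical[1:]) if b != a + 1)

def simulate(num_blocks, num_writes, cow, seed=0):
    """Apply random single-block writes and return the fragment count."""
    rng = random.Random(seed)
    physical = list(range(num_blocks))  # file starts as one contiguous run
    next_free = num_blocks              # assumed location of free space
    for _ in range(num_writes):
        block = rng.randrange(num_blocks)
        if cow:
            physical[block] = next_free  # new copy is written elsewhere
            next_free += 1
        # without CoW the block is overwritten in place: map unchanged
    return count_fragments(physical)

cow_fragments = simulate(10_000, 2_000, cow=True)
inplace_fragments = simulate(10_000, 2_000, cow=False)
print(cow_fragments, inplace_fragments)
```

   The file size never changes in either run; only the number of
fragments does, which is why shrinking the data with compression does
not address the problem.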

> Is this likely to give me ok-ish performance? What other
> possibilities are there?

   I would recommend benchmarking over time with your workloads, and
seeing how your performance degrades.
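
   As a starting point, a measurement like the following could be run
on a schedule and its result logged with the date. It is a minimal
sketch, not a substitute for benchmarking the real database workload:
time_random_reads and its parameters are hypothetical, and reads that
are served from the page cache will hide fragmentation, so the numbers
are only comparable when the file is cold each time.

```python
import os
import random
import time

def time_random_reads(path, reads=1000, block_size=16 * 1024, seed=0):
    """Return the seconds taken to read `reads` random blocks from `path`."""
    rng = random.Random(seed)  # fixed seed: same offsets on every run
    size = os.path.getsize(path)
    max_offset = max(size - block_size, 0)
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        for _ in range(reads):
            f.seek(rng.randint(0, max_offset))
            f.read(block_size)
    return time.perf_counter() - start
```

   Calling it with the path of one table file and comparing the
results week by week gives a crude view of how read performance
changes as the file ages.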

   Hugo.

-- 
Hugo Mills             | You are not stuck in traffic: you are traffic
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                    German ad campaign

