From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: BTRFS for OLTP Databases
Date: Tue, 7 Feb 2017 15:47:43 -0500
Message-ID: <7c1a67ce-a62c-36e1-d228-9a1e15e4d16c@gmail.com>
In-Reply-To: <20170207213614.5fd40981@jupiter.sol.kaishome.de>
On 2017-02-07 15:36, Kai Krakow wrote:
> On Tue, 7 Feb 2017 09:13:25 -0500,
> Peter Zaitsev <pz@percona.com> wrote:
>
>> Hi Hugo,
>>
>> For the use case I'm looking for I'm interested in having snapshot(s)
>> open at all time. Imagine for example snapshot being created every
>> hour and several of these snapshots kept at all time providing quick
>> recovery points to the state of 1,2,3 hours ago. In such case (as I
>> think you also describe) nodatacow does not provide any advantage.
>
> Out of curiosity, I see one problem here:
>
> If you're doing snapshots of the live database, each snapshot leaves
> the database files like killing the database in-flight. Like shutting
> the system down in the middle of writing data.
>
> This is because I think there's no API for user space to subscribe to
> events like a snapshot - unlike e.g. the VSS API (volume snapshot
> service) in Windows. You should put the database into frozen state to
> prepare it for a hotcopy before creating the snapshot, then ensure all
> data is flushed before continuing.
Correct.
>
> I think I've read that btrfs snapshots do not guarantee single point in
> time snapshots - the snapshot may be smeared across a longer period of
> time while the kernel is still writing data. So parts of your writes
> may still end up in the snapshot after issuing the snapshot command,
> instead of in the working copy as expected.
Also correct AFAICT, and this needs to be better documented (for most
people, the term snapshot implies atomicity of the operation).
>
> How is this going to be addressed? Is there some snapshot aware API to
> let user space subscribe to such events and do proper preparation? Is
> this planned? LVM could be a user of such an API, too. I think this
> could have nice enterprise-grade value for Linux.
Ideally, such an API should be in the VFS layer, not just BTRFS.
Reflinking exists in other filesystems already; it's only a matter of
time before they decide to do snapshotting too.
>
> XFS has xfs_freeze and xfs_thaw for this, to prepare LVM snapshots. But
> still, also this needs to be integrated with MySQL to properly work. I
> once (years ago) researched on this but gave up on my plans when I
> planned database backups for our web server infrastructure. We moved to
> creating SQL dumps instead, although there're binlogs which can be used
> to recover to a clean and stable transactional state after taking
> snapshots. But I simply didn't want to fiddle around with properly
> cleaning up binlogs which accumulate horribly much space usage over
> time. The cleanup process requires to create a cold copy or dump of the
> complete database from time to time, only then it's safe to remove all
> binlogs up to that point in time.
Sadly, fsfreeze (the generic interface based off of xfs_freeze) only
works for block device snapshots. Filesystem level snapshots need the
application software to sync all its data and then stop writing until
the snapshot is complete.
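
To make that sequence concrete, here is a rough sketch of the ordering
(pause writes, sync, snapshot, resume). The pause_writes and
resume_writes callbacks are hypothetical hooks the application would
have to supply (for MySQL, that might mean holding FLUSH TABLES WITH
READ LOCK on an open connection, but that is an assumption, not
something tested here). The function name and paths are illustrative
only.

```python
import subprocess
from typing import Callable, List, Sequence


def run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def quiesced_snapshot(
    source: str,
    dest: str,
    pause_writes: Callable[[], None],   # hypothetical application hook
    resume_writes: Callable[[], None],  # hypothetical application hook
    run: Callable[[Sequence[str]], None] = run_checked,
) -> List[str]:
    """Take a read-only snapshot of `source` while the application is paused."""
    snapshot_cmd = ["btrfs", "subvolume", "snapshot", "-r", source, dest]
    pause_writes()  # the application must stop writing first
    try:
        # Push anything still in flight out to the filesystem.
        run(["btrfs", "filesystem", "sync", source])
        run(snapshot_cmd)
    finally:
        # Always let the application resume, even if the snapshot failed.
        resume_writes()
    return snapshot_cmd
```

The try/finally is the important part: a failed snapshot must not leave
the database stuck in its paused state.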
As of right now, the sanest way I can come up with for a database server
is to find a way to do a point-in-time SQL dump of the database (this
also has the advantage that it works as a backup, and decouples you from
the backing storage format).
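
As one possible illustration, assuming MySQL with transactional
(InnoDB) tables, mysqldump's --single-transaction option produces a
dump that is consistent as of the moment the dump starts. The helper
names and the output path below are made up for the example, and
connection credentials are assumed to come from the client's usual
option files.

```python
import subprocess
from typing import List, Optional, Sequence


def build_dump_command(
    outfile: str, databases: Optional[Sequence[str]] = None
) -> List[str]:
    """Build a mysqldump invocation for a point-in-time logical dump."""
    cmd = [
        "mysqldump",
        "--single-transaction",      # consistent view for InnoDB tables
        f"--result-file={outfile}",  # write the dump to this file
    ]
    if databases:
        cmd += ["--databases", *databases]
    else:
        cmd.append("--all-databases")
    return cmd


def dump(outfile: str, databases: Optional[Sequence[str]] = None) -> None:
    """Run the dump and raise if mysqldump exits non-zero."""
    subprocess.run(build_dump_command(outfile, databases), check=True)
```

Calling dump("/backup/db.sql") would dump every database. Non-transactional
tables (MyISAM, for example) are not covered by the consistency
guarantee of --single-transaction.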