From: "Dirk Süsserott" <newsletter@dirk.my1.cc>
To: Eric Frederich <eric.frederich@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Git as a backup system?
Date: Mon, 08 Nov 2010 21:06:51 +0100
Message-ID: <4CD8585B.8000802@dirk.my1.cc>
In-Reply-To: <AANLkTikcBvN+5hkcc9+xt291B4Gm+Yhe53R3qY0PNt97@mail.gmail.com>

On 08.11.2010 19:01, Eric Frederich wrote:
> I maintain a corporate MediaWiki installation.
> Currently I have a cron job that runs daily and tar's up the contents
> of the installation directory and runs a mysqldump.
>
> I wrote a script that untar'd the contents of each backup, gunzipped the
> mysql dump, and made a git commit.
> The resulting .git directory wound up being 837M, but after running a
> long (8 minute) "git gc" command, it went down to 204M.
>
> == Questions ==
> What mysqldump options would be good to use for storage in git?
> Right now I'm not passing any parameters to mysqldump and it's doing
> all inserts for each table on a single huge line.
> Would git handle it better if each insert was on its own line?
>
> Are any of you using git for a backup system?  Have any tips, words of wisdom?
>
> Thanks,
> ~Eric

Hi Eric,

I also use mysqldump and Git to back up my databases. Git does indeed
handle the dump much better when each record is on its own line.
mysqldump has an option for that (--skip-extended-insert), but I don't
recommend it: it writes a separate "INSERT INTO ..." statement for every
row, which dramatically slows down both the dump and the restore.

The following script has worked very well for me: I run the mysqldump
output through it, and it simply inserts a line break after each
record.

----------------------------------
#!/usr/bin/perl -p

use strict;
use warnings;

# Puts each record of an extended INSERT on its own line.
#
# Before:
# INSERT INTO `schliess_grund` VALUES (1,'Explizit'),(2,'Neuanmeldung'),(4,'Sperrung'),(3,'Timeout');
#
# After:
# INSERT INTO `schliess_grund` VALUES
#      (1,'Explizit')
#     ,(2,'Neuanmeldung')
#     ,(4,'Sperrung')
#     ,(3,'Timeout')
#     ;
#
# Note: a literal "),(" inside a string value is split as well,
# which would alter that value on restore.
if (/^(INSERT INTO .*? VALUES) (.*);$/)
{
     $_ = "$1\n     $2\n    ;\n";
     s/\),\(/\)\n    ,\(/g;
}
----------------------------------

The resulting diffs will be much smaller, since a changed record now
touches only its own line. Let's call the script "wrap.pl".
Then run the following:

----------------------------------
mysqldump --opt --routines [...] -r <outfile.tmp> <dbname>
./wrap.pl <outfile.tmp> > <outfile> && rm <outfile.tmp>
git add <outfile>
if ! git diff-index --quiet HEAD --; then
     git commit -m "Backup of ..."
fi
----------------------------------
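
The commit-only-when-changed step can be tried without a database. The
sketch below uses a throwaway repository and a hand-written file in place
of the mysqldump output. The commit_if_changed function, the file name
dump.sql and the committer identity are assumptions for the demo. It also
adds --cached to the diff-index call, a small variation on the commands
above that compares only the staged content against HEAD.

```shell
#!/bin/sh
# Hypothetical helper: stage a file, then commit only if the staged
# content differs from HEAD.
commit_if_changed() {
    file=$1
    message=$2
    git add -- "$file"
    if ! git diff-index --quiet --cached HEAD --; then
        git commit -q -m "$message"
    fi
}

# Throwaway repository; an empty first commit makes HEAD valid.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Backup Demo"              # placeholder identity
git config user.email "backup@example.invalid"  # placeholder identity
git commit -q --allow-empty -m "Initial empty commit"

# Stand-in for a wrapped dump file.
printf '%s\n' "INSERT INTO t VALUES" "     (1,'a')" "    ;" > dump.sql
commit_if_changed dump.sql "Backup 1"   # new content: commits

# Same content written again.
printf '%s\n' "INSERT INTO t VALUES" "     (1,'a')" "    ;" > dump.sql
commit_if_changed dump.sql "Backup 2"   # nothing staged: no commit

# One record added.
printf '%s\n' "INSERT INTO t VALUES" "     (1,'a')" "    ,(2,'b')" "    ;" > dump.sql
commit_if_changed dump.sql "Backup 3"   # changed: commits

git log --oneline
```

The second call should leave the history untouched, so a daily cron run
on an unchanged database would not pile up empty commits.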

Try it out!

Cheers,
     Dirk
