* How to make git diff-* ignore some patterns?
@ 2009-11-21 16:40 Dirk Süsserott
2009-11-21 17:31 ` Michael J Gruber
2009-11-21 18:07 ` Björn Steinbrink
0 siblings, 2 replies; 4+ messages in thread
From: Dirk Süsserott @ 2009-11-21 16:40 UTC (permalink / raw)
To: Git Mailing List
Hi list,
is there a way to tell "git diff-index" to ignore some special patterns,
such that /^-- Dump completed on .*$/ is NOT recognized as a difference
and "git diff-index" returns 0 if that's the only difference?
-- Dirk
<Background>
I have a MySQL database which I back up daily using mysqldump (cronjob).
The result is a text file (*.sql) with all the "create" and "insert"
statements and some metadata.
I used to use tar and gzip to back up these files and got a huge
collection of backups in the last three years (500+ MB).
Then I switched to Git and recorded only the diffs between day X and day
X-1. My repository shrank to 16 MB for the very same data, which was great!
My database doesn't change every day, but I back it up anyway and store
the backup files with Git and a cronjob. It does:
---------------
mysqldump ... -r <backupfile> # that's the output file ;-)
git add <backupfile>
if ! git diff-index --quiet HEAD --; then
    git commit -m "Backup of <database> at <timestamp>"
fi
---------------
This way, a new commit is only done when the backupfile has changed. So
far, so perfect.
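For reference, the check above can be wrapped in a small function. This is only a minimal sketch, not the script actually in use: the name commit_if_changed and its two arguments are made up for illustration, and it assumes the repository already has at least one commit so that HEAD exists.

```shell
#!/bin/sh
# Hypothetical helper (not from the original script): stage a file and
# commit only if the staged content differs from HEAD.
# Assumes HEAD exists, i.e. the repository has at least one commit.
commit_if_changed() {
    file=$1
    message=$2
    git add -- "$file" || return 1
    # --quiet implies --exit-code: status 0 means "no difference",
    # status 1 means the index differs from HEAD for this path.
    if ! git diff-index --cached --quiet HEAD -- "$file"; then
        git commit --quiet -m "$message"
    fi
}
```

It would be called as commit_if_changed <backupfile> "Backup of <database> at <timestamp>" right after mysqldump has written the file.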
A few days ago my web host (where the database actually resides)
changed the MySQL version.
mysqldump now writes "-- Dump completed on <timestamp>" to the file,
Git correctly recognizes this as a change, and my script creates a new
commit. Every day, even if only that line has changed.
I'd like to skip these commits if only the "Dump completed" line has
changed.
</Background>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: How to make git diff-* ignore some patterns?
2009-11-21 16:40 How to make git diff-* ignore some patterns? Dirk Süsserott
@ 2009-11-21 17:31 ` Michael J Gruber
2009-11-21 18:07 ` Björn Steinbrink
1 sibling, 0 replies; 4+ messages in thread
From: Michael J Gruber @ 2009-11-21 17:31 UTC (permalink / raw)
To: Dirk Süsserott; +Cc: Git Mailing List
Dirk Süsserott venit, vidit, dixit 21.11.2009 17:40:
> Hi list,
>
> is there a way to tell "git diff-index" to ignore some special patterns,
> such that /^-- Dump completed on .*$/ is NOT recognized as a difference
> and "git diff-index" returns 0 if that's the only difference?
Is the dump guaranteed to be in a specific order? If yes then this
procedure makes sense. (pdfs etc. are problematic because of reordering.)
You can either egrep -v through the output of git diff-index, or define
a diff driver: set an attribute, say "dumpdiff", for dump files (see
gitattributes) and define diff driver as
    git config diff.dumpdiff.textconv dumpdiff.sh
where dumpdiff.sh is "egrep -v ...". You may need to call diff-index
with --ext-diff. I haven't tried, though ;)
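A rough sketch of the first suggestion (filtering the diff output), under a few assumptions: the function name has_real_changes is made up, the file has already been staged with git add, and -p/-U0 are passed so that diff-index prints a patch without context lines instead of its default raw output. Treat it as a starting point rather than a tested solution for real dumps.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) only if the staged version of a
# file differs from HEAD in lines other than the "Dump completed" marker.
has_real_changes() {
    file=$1
    # 1. sed drops the per-file header, up to and including the first hunk line.
    # 2. The first grep keeps only added/removed lines.
    # 3. The second grep succeeds if any of them is not the marker line.
    git diff-index --cached -p -U0 HEAD -- "$file" |
        sed '1,/^@@/d' |
        grep -E '^[+-]' |
        grep -Evq '^[+-]-- Dump completed on '
}
```

In the backup script, this could replace the "git diff-index --quiet" test. Note that when only the marker line changed, the new version stays staged in the index even though no commit is made.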
Cheers,
Michael
* Re: How to make git diff-* ignore some patterns?
2009-11-21 16:40 How to make git diff-* ignore some patterns? Dirk Süsserott
2009-11-21 17:31 ` Michael J Gruber
@ 2009-11-21 18:07 ` Björn Steinbrink
2009-11-22 15:51 ` Dirk Süsserott
1 sibling, 1 reply; 4+ messages in thread
From: Björn Steinbrink @ 2009-11-21 18:07 UTC (permalink / raw)
To: Dirk Süsserott; +Cc: Git Mailing List
On 2009.11.21 17:40:14 +0100, Dirk Süsserott wrote:
> is there a way to tell "git diff-index" to ignore some special
> patterns, such that /^-- Dump completed on .*$/ is NOT recognized as
> a difference and "git diff-index" returns 0 if that's the only
> difference?
If you don't mind losing that line, you could use a clean filter via
.gitattributes:
echo '*.sql filter=mysql_dump' >> .gitattributes
git config filter.mysql_dump.clean "sed -e '/^-- Dump completed on .*$/d'"
That way, git will filter all *.sql paths through that sed command
before storing them as blobs, dropping that "Dump completed" line from
the data stored in the repo.
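To see what the filter does without touching a repository, the same sed expression can be run on sample input. A small sketch; the function name strip_dump_marker and the sample lines are invented for illustration:

```shell
#!/bin/sh
# Same sed expression as in the clean filter above, wrapped in a
# hypothetical function so it can be tried on sample input.
strip_dump_marker() {
    sed -e '/^-- Dump completed on .*$/d'
}

# Sample input (invented): a data line, the marker line, another comment.
printf '%s\n' \
    'INSERT INTO t VALUES (1);' \
    '-- Dump completed on 2009-11-21 16:40:00' \
    '-- some other comment' |
    strip_dump_marker
```

Only the marker line is dropped; other "--" comments pass through. Once the attribute is in place, "git check-attr filter -- <backupfile>" should report mysql_dump for matching paths. The file in the working tree keeps the line; only the blob stored by git lacks it.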
Björn
* Re: How to make git diff-* ignore some patterns?
2009-11-21 18:07 ` Björn Steinbrink
@ 2009-11-22 15:51 ` Dirk Süsserott
0 siblings, 0 replies; 4+ messages in thread
From: Dirk Süsserott @ 2009-11-22 15:51 UTC (permalink / raw)
To: Björn Steinbrink; +Cc: Git Mailing List, git
On 21.11.2009 19:07, Björn Steinbrink wrote:
> If you don't mind losing that line, you could use a clean filter via
> .gitattributes:
>
> echo '*.sql filter=mysql_dump' >> .gitattributes
> git config filter.mysql_dump.clean "sed -e '/^-- Dump completed on .*$/d'"
>
> That way, git will filter all *.sql paths through that sed command
> before storing them as blobs, dropping that "Dump completed" line from
> the data stored in the repo.
>
> Björn
>
Thank you Björn and Michael,
Your suggestions were really helpful. I decided to use Björn's 'clean
filter' approach. It works great.
-- Dirk