From: "David Tweed" <david.tweed@gmail.com>
To: "Luciano Rocha" <luciano@eurotux.com>
Cc: git@vger.kernel.org
Subject: Re: backups with git and inotify
Date: Mon, 10 Dec 2007 21:18:18 +0000
Message-ID: <e1dab3980712101318v264fcce5pebbb829d8cefb1ac@mail.gmail.com>
In-Reply-To: <20071210202911.GA14738@bit.office.eurotux.com>

Hi, this looks like an interesting project.

I've been doing something similar-ish (but periodically rather than
upon changes). Here are some rough off-the-cuff observations (probably
telling you things you already know).

Firstly, are you doing backups (being able to restore to one of n
previous states after a catastrophe) or archives (being able to look up
arbitrary points in history to compare with the current state, e.g.,
for regressions)? (Archiving is useful if you aren't in a project
disciplined enough to produce proper, rewritten commits but still want
to be able to look around and try to figure out what's caused
regressions, etc.)

The scripts I use are at

http://www.personal.rdg.ac.uk/~sis05dst/chronoversion.tgz

but they're designed around archiving rather than backups.

On Dec 10, 2007 8:29 PM, Luciano Rocha <luciano@eurotux.com> wrote:
> The following is a work in progress. There are some problems in how I'm
> using git and recording the history:
>
> 1. I use an opened fd for each monitored directory (and subdirectories),
>    (inotify_add_watch_at would be nice).
>    I fchdir(fd) when a change happens to register and commit it.

I thought about trying to have a daemon using inotify to record the
git-add's/git-rm's while keeping the actual commits cron-driven, and
looked at the Python support module. I didn't, firstly because I wasn't
sure how well inotify scales (the fact that the Linux VFS maintainer
insists on calling it "idiotify" doesn't inspire confidence). If it
were me, I'd pull the git-commit outside your loop that does the
git-add/git-rm (see the later comment about emacs, etc). Obviously, if
your buffer isn't completely emptied you'll get a misleading
granularity of commits, but then I guess that'll happen anyway. I think
inotify drops events if an internal queue fills: personally I'd try to
check for that and initiate a manual scan if I detected it happening.
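
As a rough sketch of that structure (not taken from either of our
programs): drain a batch of events, stage everything, then commit once,
and fall back to a full rescan if the kernel reports a queue overflow.
The mask values are the ones from <sys/inotify.h>; plan_batch,
git_commands and the commit message are hypothetical names, and the
(mask, path) event tuples are assumed to come from whatever inotify
binding is in use.

```python
# Mask bits as defined in <sys/inotify.h>.
IN_CLOSE_WRITE = 0x00000008
IN_MOVED_FROM = 0x00000040
IN_MOVED_TO = 0x00000080
IN_CREATE = 0x00000100
IN_DELETE = 0x00000200
IN_Q_OVERFLOW = 0x00004000


def plan_batch(events):
    """Turn a drained batch of (mask, path) events into a staging plan.

    Returns (to_add, to_remove, overflowed). The last event seen for a
    path wins, so a file created and then deleted ends up in to_remove.
    """
    to_add, to_remove = set(), set()
    overflowed = False
    for mask, path in events:
        if mask & IN_Q_OVERFLOW:
            # The kernel dropped events; the batch can't be trusted.
            overflowed = True
            continue
        if mask & (IN_DELETE | IN_MOVED_FROM):
            to_add.discard(path)
            to_remove.add(path)
        elif mask & (IN_CREATE | IN_CLOSE_WRITE | IN_MOVED_TO):
            to_remove.discard(path)
            to_add.add(path)
    return to_add, to_remove, overflowed


def git_commands(to_add, to_remove, overflowed, message="automatic snapshot"):
    """Build the git command lines: stage everything, then commit once."""
    cmds = []
    if overflowed:
        # Events were lost, so rescan the whole tree instead.
        cmds.append(["git", "add", "--all", "."])
    else:
        if to_remove:
            cmds.append(["git", "rm", "--cached", "--quiet",
                         "--ignore-unmatch", "--"] + sorted(to_remove))
        if to_add:
            cmds.append(["git", "add", "--"] + sorted(to_add))
    if cmds:
        # A single commit per batch, outside the add/rm loop.
        cmds.append(["git", "commit", "--quiet", "-m", message])
    return cmds
```

Each returned list can be handed to subprocess.run(cmd, cwd=repo); note
that git-commit exits non-zero when the staged result turns out to be
identical to HEAD, so the caller shouldn't treat that as fatal.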

> 2. git-rm dir/file also removes <dir> if file was the only entry of
>    <dir>. So, when committing the removal, git complains that it can't
>    find cwd. So I record the parent directory, do the git command, check
>    if getcwd() works, and if not do the commit in the parent directory.
>
> 3. git-rm (empty) directory fails
>
> 4. Changes aren't atomic, but I can live with that and I doubt I would
>    be able to make it atomic without implementing a filesystem (FUSE or
>    not).

With programs like emacs that update a file by writing a new file under
a temporary name and then copying it over the top of the old file,
you'll presumably get 3 commits. Is that acceptable?
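
One way to avoid that, sketched under the assumption that events are
collected for a short while before anything is committed: collapse each
batch to its net effect, so a temporary file that appears and vanishes
within the batch is never staged at all. The event kinds ("create",
"write", "delete") and the function name are made up for illustration;
a rename is assumed to be reported as a "delete" of the old name plus a
"write" of the new one.

```python
def net_changes(events):
    """Collapse an ordered batch of (kind, path) events to its net effect.

    Returns (changed, deleted): paths to git-add and paths to git-rm.
    """
    born_here = set()              # paths first created inside this batch
    changed, deleted = set(), set()
    for kind, path in events:
        if kind == "delete":
            changed.discard(path)
            if path in born_here:
                # Transient file: it never existed outside this batch.
                born_here.discard(path)
            else:
                deleted.add(path)
        else:
            if kind == "create" and path not in deleted:
                born_here.add(path)
            deleted.discard(path)
            changed.add(path)
    return changed, deleted


# An editor saving notes.txt via a temporary file:
save = [
    ("create", "notes.txt.tmp"),
    ("write", "notes.txt.tmp"),
    ("delete", "notes.txt.tmp"),   # old name of the rename
    ("write", "notes.txt"),        # new name of the rename
]
changed, deleted = net_changes(save)
```

Here the whole save reduces to one changed path (notes.txt) and nothing
to remove, i.e. one commit rather than three.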

> I can work around most of the problems, and rewrite to use recorded path
> names instead of directories fd, but before I do that, and while I'm
> at the beginning, I'd like to probe for opinions and suggestions.

The only other thing that occurs to me is whether you need any greater
support for stopping the automatic monitoring than just stopping the
daemon. E.g., what happens if you decide you need to recover a previous
version of a file? Git checks it out, presumably updates the index
itself, and then inotify fires off a git-add that will want to write to
the index. Basically, I'm trying to think whether there's any situation
where you can have a delete event that git causes, followed by the
creation of some new content, where a delay in your program processing
the delete will cause the new content to be `lost'. (I know, I should
read the code.) In chronoversion, the first thing it does is check for
a "suppress" file which stops it doing anything automatically, and I
put one in there whenever I'm doing anything more than looking at the
data (e.g., switching branch, checking out an old version, etc). But I
might be being hyper-cautious.
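
The suppress-file idea is simple enough to sketch in a few lines. This
is not the chronoversion code: the file name "suppress", its location
at the top of the working tree, and both function names are assumptions
made for illustration.

```python
import contextlib
import os

SUPPRESS_NAME = "suppress"  # hypothetical marker file name


def automation_allowed(repo_dir, name=SUPPRESS_NAME):
    """Return True unless the marker file is present in repo_dir."""
    return not os.path.exists(os.path.join(repo_dir, name))


@contextlib.contextmanager
def suppressed(repo_dir, name=SUPPRESS_NAME):
    """Keep the marker file in place for the duration of manual work."""
    marker = os.path.join(repo_dir, name)
    open(marker, "a").close()      # create the marker if it is missing
    try:
        yield
    finally:
        os.remove(marker)          # automation may resume afterwards
```

The daemon (or cron job) would call automation_allowed() before it
touches the index, and manual operations such as a checkout of an old
version would run inside "with suppressed(repo_dir):".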

-- 
cheers, dave tweed__________________________
david.tweed@gmail.com
Rm 124, School of Systems Engineering, University of Reading.
"we had no idea that when we added templates we were adding a Turing-
complete compile-time language." -- C++ standardisation committee
