From: Luciano Rocha <luciano@eurotux.com>
To: David Tweed <david.tweed@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: backups with git and inotify
Date: Mon, 10 Dec 2007 21:47:28 +0000
Message-ID: <20071210214728.GB17458@bit.office.eurotux.com>
In-Reply-To: <e1dab3980712101318v264fcce5pebbb829d8cefb1ac@mail.gmail.com>

On Mon, Dec 10, 2007 at 09:18:18PM +0000, David Tweed wrote:
> Hi, this looks like an interesting project.
> 
> I've been doing something similar-ish (but periodically rather than
> upon changes). Here are some rough off-the-cuff observations (probably
> telling you things you already know).

Thanks.

> 
> Firstly, are you doing backups (being able to restore to n previous
> states upon catastrophe) or archives (being able to look up arbitrary
> points in history to compare with current stuff, eg, for regressions)?
> (Archiving is useful if you aren't in a disciplined enough project to
> do rewritten proper commits but still want to be able to look around
> and try to figure out what's caused regressions, etc.)

Archives, only. If the project requires a coherent state, then I don't
think that automatic commits are the way to go.

> 
> The scripts I use are at
> 
> http://www.personal.rdg.ac.uk/~sis05dst/chronoversion.tgz
> 
> but they're designed around archiving rather than backups.

Thanks, I'll take a look, and maybe borrow some ideas. :)

> 
> On Dec 10, 2007 8:29 PM, Luciano Rocha <luciano@eurotux.com> wrote:
> > The following is a work in progress. There are some problems in how I'm
> > using git and recording the history:
> >
> > 1. I use an opened fd for each monitored directory (and subdirectories),
> >    (inotify_add_watch_at would be nice).
> >    I fchdir(fd) when a change happens to register and commit it.
> 
> I thought about trying to have a daemon using inotify to record the
> git-add's/git-rm's but keeping the cron-driven actual commits, and
> looked at the Python support module. I didn't because firstly I wasn't
> sure how far inotify scaled (the fact the Linux VFS maintainer insists
> on calling it "idiotify" doesn't inspire confidence). If it was me,
> I'd pull the git-commit outside your loop that does the git-add/git-rm
> (see later comment about emacs, etc).

I'd like to have the changes committed as soon as an application closes
its files. As I monitor only a small subset of the possible inotify
events, I don't think I'll have many problems with scale. I'll have to
test it with my Maildir, though, to have a definitive answer.
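
A rough sketch of what "a small subset of inotify events" could look
like. The constant values are the ones from <sys/inotify.h> on Linux;
the particular choice of events and the classify() helper are
assumptions made for illustration, not the daemon's actual code.

```python
# Event bits as defined in <sys/inotify.h> on Linux.
IN_CLOSE_WRITE = 0x00000008  # a file opened for writing was closed
IN_MOVED_FROM = 0x00000040   # an entry was moved out of a watched directory
IN_MOVED_TO = 0x00000080     # an entry was moved into a watched directory
IN_CREATE = 0x00000100       # an entry was created
IN_DELETE = 0x00000200       # an entry was deleted
IN_ISDIR = 0x40000000        # set when the subject of the event is a directory

# Assumed subset: enough to notice "an application closed its file",
# renames, and deletions, without IN_MODIFY/IN_ACCESS/IN_OPEN noise.
WATCH_MASK = IN_CLOSE_WRITE | IN_MOVED_FROM | IN_MOVED_TO | IN_CREATE | IN_DELETE


def classify(mask):
    """Map one event mask to 'watch', 'add', 'rm', or None (ignore)."""
    if mask & IN_ISDIR:
        # A new directory needs its own watch; git tracks no empty directories.
        return "watch" if mask & (IN_CREATE | IN_MOVED_TO) else None
    if mask & (IN_CLOSE_WRITE | IN_MOVED_TO):
        return "add"
    if mask & (IN_DELETE | IN_MOVED_FROM):
        return "rm"
    # A plain IN_CREATE on a file: wait for the matching IN_CLOSE_WRITE.
    return None
```

Waiting for IN_CLOSE_WRITE instead of reacting to IN_CREATE is what
gives the "commit when the application closes its files" behaviour.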

> Obviously if your buffer isn't
> completely emptied you'll get a misleading granularity of commits, but
> then I guess that'll happen anyway. I think inotify drops events if an
> internal queue fills: personally I'd try to check for that and
> initiate manually scanning if I detected that happening.

Hm, that could be a problem. Maybe a periodic git-status followed by
git-add/rm, etc. Hourly, perhaps.
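
One way this could be sketched: the kernel reports a full queue with an
IN_Q_OVERFLOW event (value taken from <sys/inotify.h>), so the daemon
can fall back to a full rescan when it sees one, and run the same rescan
from a timer. The function names are hypothetical, and `git add -A`
assumes a Git newer than the one current at the time of this thread.

```python
import subprocess

IN_Q_OVERFLOW = 0x00004000  # <sys/inotify.h>: the event queue overflowed


def needs_full_rescan(masks):
    """True if any event in the batch says that events were dropped."""
    return any(mask & IN_Q_OVERFLOW for mask in masks)


def rescan(repo_dir, message="automatic rescan"):
    """Stage every difference from the index; commit only if something changed."""
    subprocess.run(["git", "add", "-A"], cwd=repo_dir, check=True)
    # `git diff --cached --quiet` exits with status 1 when changes are staged.
    staged = subprocess.run(["git", "diff", "--cached", "--quiet"], cwd=repo_dir)
    if staged.returncode == 1:
        subprocess.run(["git", "commit", "-q", "-m", message],
                       cwd=repo_dir, check=True)
        return True
    return False
```

After an overflow the individual events can no longer be trusted, so
the rescan replaces per-file git-add/git-rm for that batch.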

> 
> > 2. git-rm dir/file also removes <dir> if file was the only entry of
> >    <dir>. So, when committing the removal, git complains that it can't
> >    find cwd. So I record the parent directory, do the git command, check
> >    if getcwd() works, and if not do the commit in the parent directory.
> >
> > 3. git-rm (empty) directory fails
> >
> > 4. Changes aren't atomic, but I can live with that and I doubt I would
> >    be able to make it atomic without implementing a filesystem (FUSE or
> >    not).
> 
> With things like emacs that do update writes by writing a new file
> with a temporary name and then copying it over the top of the old
> file, you'll presumably get 3 commits. Is that acceptable?

No. Vim also has that behaviour. I plan on accepting ignore patterns,
and maybe also parsing .gitignore, and implicitly adding patterns for
those temporary files (*~, .*.sw[po], etc.).
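
A minimal sketch of such implicit ignore patterns, using only the two
globs mentioned above plus optional user-supplied ones. The names
DEFAULT_IGNORES and is_ignored are made up for illustration, and this
matches file names only; real .gitignore semantics (negation, anchored
paths, directory-only patterns) are richer and not covered here.

```python
import os
from fnmatch import fnmatchcase

# Editor temporaries named in this thread: backup files and Vim swap files.
DEFAULT_IGNORES = ["*~", ".*.sw[po]"]


def is_ignored(path, extra_patterns=()):
    """True if the file name (not the full path) matches an ignore glob."""
    name = os.path.basename(path)
    patterns = [*DEFAULT_IGNORES, *extra_patterns]
    # fnmatchcase keeps matching case-sensitive on every platform.
    return any(fnmatchcase(name, pattern) for pattern in patterns)
```

Events for ignored names would simply be dropped before any
git-add/git-rm, so an editor's save sequence yields one commit for the
real file instead of one per temporary.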

> 
> > I can work around most of the problems, and rewrite to use recorded path
> > names instead of directories fd, but before I do that, and while I'm
> > at the beginning, I'd like to probe for opinions and suggestions.
> 
> The only other thing that occurs to me is whether you need any greater
> support for stopping the automatic monitoring than just stopping the
> daemon. Eg, what happens if you decide you need to recover a previous
> version of a file. Git checks it out, presumably updates the index
> itself and then inotify fires off a git-add that will want to write to
> the index. Basically, I'm trying to think if there's any situation
> where you can have a delete event that git causes, followed by the
> creation of some new content, where a delay in your program processing
> the delete will cause the new content to be `lost'? (I know, I should read
> the code.) In chronoversion, the first thing it does is check for a
> "suppress" file which stops it doing anything automatically and I put
> one in there whenever I'm doing anything more than looking at the data
> (eg, switch branch, checkout old version, etc). But I might be being
> hyper-cautious.

I'll have to think about that. A stop/pause button is a good idea, as is
checking if the tree is at HEAD. I don't think committing changes to a
file that was checked out at a previous version will lose any
information, but I'll check.
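
The pause idea could be sketched roughly as below, modelled on the
"suppress" file described for chronoversion. The file name
.autocommit-suppress and both function names are hypothetical, and the
"is the tree at HEAD" check is left out.

```python
import os

SUPPRESS_FILE = ".autocommit-suppress"  # hypothetical marker file name


def is_paused(repo_dir):
    """True while the marker file exists at the top of the watched tree."""
    return os.path.exists(os.path.join(repo_dir, SUPPRESS_FILE))


def handle_event(repo_dir, path, commit):
    """Call commit(path) unless automatic commits are paused.

    `commit` is whatever callable performs the git-add and git-commit.
    Returns True if the event was acted upon, False if it was skipped.
    """
    if is_paused(repo_dir):
        return False
    commit(path)
    return True
```

Creating the marker before a manual checkout or branch switch, and
removing it afterwards, keeps the daemon from racing git for the index.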

-- 
Luciano Rocha <luciano@eurotux.com>
Eurotux Informática, S.A. <http://www.eurotux.com/>


Thread overview: 6+ messages
2007-12-10 20:29 backups with git and inotify Luciano Rocha
2007-12-10 21:18 ` David Tweed
2007-12-10 21:47   ` Luciano Rocha [this message]
2007-12-10 21:57 ` Björn Steinbrink
2007-12-11 10:25   ` Luciano Rocha
2007-12-11 13:24     ` David Tweed
