From: Eric Wong <e@80x24.org>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Tim Hutt <tdhutt@gmail.com>, git@vger.kernel.org
Subject: Re: Monitoring a repository for changes
Date: Wed, 21 Jun 2017 19:52:52 +0000
Message-ID: <20170621195252.GA31582@starla>
In-Reply-To: <87efud8jrn.fsf@gmail.com>

Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> On Wed, Jun 21 2017, Tim Hutt jotted:
> 
> > Hi,
> >
> > Currently if you want to monitor a repository for changes there are
> > three options:
> >
> > * Polling - run a script to check for updates every 60 seconds.
> > * Server side hooks
> > * Web hooks (on Github, Bitbucket etc.)
> >
> > Unfortunately for many (most?) cases server-side hooks and web hooks
> > are not suitable. They require you to both have admin access to the
> > repo and have a public server available to push updates to. That is a
> > huge faff when all I want to do is run some local code when a repo is
> > updated (e.g. play a sound).

Yeah, it kinda sucks that way.

Currently, for one of my public-inbox mirrors which has ssh
access to the primary server on public-inbox.org, I have:

	#!/bin/sh
	# outer loop: reconnect whenever the ssh session ends
	while true
	do
		# GNU tail(1) uses inotify to avoid polling on Linux
		ssh public-inbox.org tail -F /path/to/git-vger.git/info/refs |
		while read sha1 ref
		do
			# each line read means info/refs changed on the server
			for GIT_DIR in git-vger.git
			do
				export GIT_DIR
				git fetch || continue
				git update-server-info
				public-inbox-index # update Xapian index
			done
		done
	done

It's not perfect as it requires multiple processes on the
server, but it's better than polling for my limited use.
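
One possible refinement, sketched under assumptions: if tail -F
re-emits every line of info/refs each time the file is replaced,
a repository with many refs would trigger one fetch per line.
The hypothetical changed_refs filter below (not part of the
script above) passes a line through only when the sha1 for that
ref differs from the last one seen:

```shell
#!/bin/sh
# changed_refs: hypothetical helper, not part of the original script.
# Reads "sha1 ref" lines on stdin and prints a line only when the sha1
# for that ref differs from the previous value seen for the same ref.
changed_refs() {
	awk '{
		sha = $1 ""	# force string (not numeric) comparison
		if (sha != seen[$2]) {
			seen[$2] = sha
			print
			fflush()	# do not let pipe buffering delay the reader
		}
	}'
}
```

It would sit between ssh and the read loop, e.g.
"ssh public-inbox.org tail -F /path/to/git-vger.git/info/refs |
changed_refs | while read sha1 ref".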

> > Currently people resort to polling
> > (https://stackoverflow.com/a/5199111/265521) which is just ugly. I
> > would like to propose that there should be a fourth option that uses a
> > persistent connection to monitor the repo. It would be used something
> > like this:
> >
> >     git watch https://github.com/git/git.git
> >
> > or
> >
> >     git watch git@github.com:git/git.git
> >
> > It would then print simple messages to stdout. The complexity of what
> > it prints is up for debate - it could be something as simple as
> > "PUSH\n", or it could include more information, e.g. JSON-encoded
> > information about the commits. I'd be happy with just "PUSH\n" though.
> 
> Insofar as this could be implemented in some standard way in Git it's
> likely to have a large overlap with the "protocol v2" that keeps coming
> up here on-list. You might want to search for past threads discussing
> that.

Yeah, it hasn't been a priority for me, either...

> > In terms of implementation, the HTTP transport could use Server-Sent
> > Events, and the SSH transport can pretty much do whatever so that
> > should be easy.
> 
> In case you didn't know, any of the non-trivially sized git hosting
> providers (e.g. github, gitlab) provide you access over ssh, but you
> can't just run any arbitrary command, it's a tiny set of whitelisted
> commands. See the "git-shell" manual page (github doesn't use that exact
> software, but something similar).
> 
> But overall, it would be nice to have some rationale for this approach
> other than that you think polling is ugly. There's a lot of advantages
> to polling for something you don't need near-instantly, e.g. imagine how
> many active connections a site like GitHub would need to handle if
> something like this became widely used, that's in a lot of ways harder
> to scale and load balance than just having clients that poll something
> that's trivially cached as static content.

Polling becomes more expensive with TLS and high-latency
connections, and also increases power consumption if done
frequently for redundancy purposes.
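
For comparison, a minimal sketch of the polling approach under
discussion, assuming "git ls-remote" is used to list the remote
refs.  The function name refs_changed, the state-file argument,
and the $url/$state variables and 60-second interval in the usage
comment are all illustrative.  Each poll opens a fresh
connection, which is where the TLS and latency cost comes from:

```shell
#!/bin/sh
# refs_changed STATE_FILE: hypothetical helper for a polling loop.
# Reads a ref listing on stdin (e.g. the output of `git ls-remote`),
# compares it with the listing saved in STATE_FILE by the previous
# call, and saves the new listing.  Exit status 0 means "changed".
refs_changed() {
	state=$1
	new=$(cat)
	old=
	[ -f "$state" ] && old=$(cat "$state")
	[ "$new" = "$old" ] && return 1
	printf '%s\n' "$new" >"$state"
	return 0
}

# Usage sketch (not executed here):
#	while sleep 60
#	do
#		git ls-remote "$url" | refs_changed "$state" && git fetch
#	done
```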

I've long wanted to do something better to allow others to keep
public-inbox mirrors up-to-date.  Keeping userspace overhead down
to 64-128 bytes per connection should be totally doable based on
my experience working on cmogstored, at which point port
exhaustion (or TLS overhead for HTTPS) becomes the limiting
factor.

But a cheaper option might be the traditional email/IRC
notification, with a client-side process watching for that
before fetching.
