git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Eric Wong <e@80x24.org>
Cc: Tim Hutt <tdhutt@gmail.com>, git@vger.kernel.org
Subject: Re: Monitoring a repository for changes
Date: Wed, 21 Jun 2017 23:56:25 +0200	[thread overview]
Message-ID: <87a85180om.fsf@gmail.com> (raw)
In-Reply-To: <20170621195252.GA31582@starla>


On Wed, Jun 21 2017, Eric Wong jotted:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>> On Wed, Jun 21 2017, Tim Hutt jotted:
>>
>> > Hi,
>> >
>> > Currently if you want to monitor a repository for changes there are
>> > three options:
>> >
>> > * Polling - run a script to check for updates every 60 seconds.
>> > * Server side hooks
>> > * Web hooks (on Github, Bitbucket etc.)
>> >
>> > Unfortunately for many (most?) cases server-side hooks and web hooks
>> > are not suitable. They require you to both have admin access to the
>> > repo and have a public server available to push updates to. That is a
>> > huge faff when all I want to do is run some local code when a repo is
>> > updated (e.g. play a sound).
>
> Yeah, it kinda sucks that way.
>
> Currently, for one of my public-inbox mirrors which has ssh
> access to the primary server on public-inbox.org, I have:
>
> 	#!/bin/sh
> 	while true
> 	do
> 		# GNU tail(1) uses inotify to avoid polling on Linux
> 		ssh public-inbox.org tail -F /path/to/git-vger.git/info/refs | \
> 				while read sha1 ref
> 		do
> 			for GIT_DIR in git-vger.git
> 			do
> 				export GIT_DIR
> 				git fetch || continue
> 				git update-server-info
> 				public-inbox-index # update Xapian index
> 			done
> 		done
> 	done
>
> It's not perfect as it requires multiple processes on the
> server, but it's better than polling for my limited use.
>
>> > Currently people resort to polling
>> > (https://stackoverflow.com/a/5199111/265521) which is just ugly. I
>> > would like to propose that there should be a forth option that uses a
>> > persistent connection to monitor the repo. It would be used something
>> > like this:
>> >
>> >     git watch https://github.com/git/git.git
>> >
>> > or
>> >
>> >     git watch git@github.com:git/git.git
>> >
>> > It would then print simple messages to stdout. The complexity of what
>> > it prints is up for debate, - it could be something as simple as
>> > "PUSH\n", or it could include more information, e.g. JSON-encoded
>> > information about the commits. I'd be happy with just "PUSH\n" though.
>>
>> Insofar as this could be implemented in some standard way in Git it's
>> likely to have a large overlap with the "protocol v2" that keeps coming
>> up here on-list. You might want to search for past threads discussing
>> that.
>
> Yeah, it hasn't been a priority for me, either...
>
>> > In terms of implementation, the HTTP transport could use Server-Sent
>> > Events, and the SSH transport can pretty much do whatever so that
>> > should be easy.
>>
>> In case you didn't know, any of the non-trivially sized git hosting
>> providers (e.g. github, gitlab) provide you access over ssh, but you
>> can't just run any arbitrary command, it's a tiny set of whitelisted
>> commands. See the "git-shell" manual page (github doesn't use that exact
>> software, but something similar).
>>
>> But overall, it would be nice to have some rationale for this approach
>> other than that you think polling is ugly. There's a lot of advantages
>> to polling for something you don't need near-instantly, e.g. imagine how
>> many active connections a site like GitHub would need to handle if
>> something like this became widely used, that's in a lot of ways harder
>> to scale and load balance than just having clients that poll something
>> that's trivially cached as static content.
>
> Polling becomes more expensive with TLS and high-latency
> connections, and also increases power consumption if done
> frequently for redundancy purposes.
>
> I've long wanted to do something better to allow others to keep
> public-inbox mirrors up-to-date.  Having only 64-128 bytes of
> overhead per userspace per-connection should be totally doable
> based on my experience working on cmogstored; at which point
> port exhaustion will become the limiting factor (or TLS overhead
> for HTTPS).

Come to think of it I should probably have asked you about this, but I
have a one-liner running that polls every 5 minutes, but will stop if I
haven't changed my git.git in a day:

    while true; do if test $(find ~/g/git -type f -mmin -1440 | wc -l) -gt 0; then git pull; else echo too old; fi ; date ; sleep 300; done

> But perhaps a cheaper option might be the traditional email/IRC
> notification and having a client-side process watch for that
> before fetching.

If there was a IRC channel with this info I could/would use that,
getting it via E-Mail would just get me into the same problem
public-inbox is currently solving for me, i.e. I might as well keep the
git ML up-to-date on that machine if I'm going to otherwise need to
subscribe to a "hey there's a new message on the git ML" list :)

  reply	other threads:[~2017-06-21 21:56 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-21 14:27 Monitoring a repository for changes Tim Hutt
2017-06-21 15:04 ` Ævar Arnfjörð Bjarmason
2017-06-21 19:44   ` Jeff King
2017-06-21 19:55     ` Stefan Beller
2017-06-21 19:52   ` Eric Wong
2017-06-21 21:56     ` Ævar Arnfjörð Bjarmason [this message]
2017-06-21 22:20       ` Eric Wong
2017-06-21 22:36         ` Eric Wong
2017-06-21 21:19 ` Jonathan Nieder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a85180om.fsf@gmail.com \
    --to=avarab@gmail.com \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=tdhutt@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).