From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Eric Wong <e@80x24.org>
Cc: Tim Hutt <tdhutt@gmail.com>, git@vger.kernel.org
Subject: Re: Monitoring a repository for changes
Date: Wed, 21 Jun 2017 23:56:25 +0200
Message-ID: <87a85180om.fsf@gmail.com>
In-Reply-To: <20170621195252.GA31582@starla>


On Wed, Jun 21 2017, Eric Wong jotted:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>> On Wed, Jun 21 2017, Tim Hutt jotted:
>>
>> > Hi,
>> >
>> > Currently if you want to monitor a repository for changes there are
>> > three options:
>> >
>> > * Polling - run a script to check for updates every 60 seconds.
>> > * Server side hooks
>> > * Web hooks (on Github, Bitbucket etc.)
>> >
>> > Unfortunately for many (most?) cases server-side hooks and web hooks
>> > are not suitable. They require you to both have admin access to the
>> > repo and have a public server available to push updates to. That is a
>> > huge faff when all I want to do is run some local code when a repo is
>> > updated (e.g. play a sound).
>
> Yeah, it kinda sucks that way.
>
> Currently, for one of my public-inbox mirrors which has ssh
> access to the primary server on public-inbox.org, I have:
>
> 	#!/bin/sh
> 	while true
> 	do
> 		# GNU tail(1) uses inotify to avoid polling on Linux
> 		ssh public-inbox.org tail -F /path/to/git-vger.git/info/refs | \
> 				while read sha1 ref
> 		do
> 			for GIT_DIR in git-vger.git
> 			do
> 				export GIT_DIR
> 				git fetch || continue
> 				git update-server-info
> 				public-inbox-index # update Xapian index
> 			done
> 		done
> 	done
>
> It's not perfect as it requires multiple processes on the
> server, but it's better than polling for my limited use.
>
>> > Currently people resort to polling
>> > (https://stackoverflow.com/a/5199111/265521) which is just ugly. I
>> > would like to propose that there should be a fourth option that uses a
>> > persistent connection to monitor the repo. It would be used something
>> > like this:
>> >
>> >     git watch https://github.com/git/git.git
>> >
>> > or
>> >
>> >     git watch git@github.com:git/git.git
>> >
>> > It would then print simple messages to stdout. The complexity of what
>> > it prints is up for debate: it could be something as simple as
>> > "PUSH\n", or it could include more information, e.g. JSON-encoded
>> > information about the commits. I'd be happy with just "PUSH\n" though.
>>
>> Insofar as this could be implemented in some standard way in Git it's
>> likely to have a large overlap with the "protocol v2" that keeps coming
>> up here on-list. You might want to search for past threads discussing
>> that.
>
> Yeah, it hasn't been a priority for me, either...
>
>> > In terms of implementation, the HTTP transport could use Server-Sent
>> > Events, and the SSH transport can pretty much do whatever so that
>> > should be easy.
>>
>> In case you didn't know, any of the non-trivially sized git hosting
>> providers (e.g. github, gitlab) provide you access over ssh, but you
>> can't just run any arbitrary command, it's a tiny set of whitelisted
>> commands. See the "git-shell" manual page (github doesn't use that exact
>> software, but something similar).
>>
>> But overall, it would be nice to have some rationale for this approach
>> other than that you think polling is ugly. There are a lot of advantages
>> to polling for something you don't need near-instantly, e.g. imagine how
>> many active connections a site like GitHub would need to handle if
>> something like this became widely used, that's in a lot of ways harder
>> to scale and load balance than just having clients that poll something
>> that's trivially cached as static content.
>
> Polling becomes more expensive with TLS and high-latency
> connections, and also increases power consumption if done
> frequently for redundancy purposes.
>
> I've long wanted to do something better to allow others to keep
> public-inbox mirrors up-to-date.  Having only 64-128 bytes of
> overhead per userspace per-connection should be totally doable
> based on my experience working on cmogstored; at which point
> port exhaustion will become the limiting factor (or TLS overhead
> for HTTPS).

Come to think of it, I should probably have asked you about this. I have
a one-liner running that polls every 5 minutes, but it stops pulling if
nothing in my git.git has changed in a day:

    while true; do if test $(find ~/g/git -type f -mmin -1440 | wc -l) -gt 0; then git pull; else echo too old; fi ; date ; sleep 300; done
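
For readability, here is a rough multi-line sketch of the same idea. The
function names, the parameters, and the use of `git -C` are illustrative
assumptions and not what I actually run; `-mmin` is a find extension (GNU and
BSD) rather than POSIX.

```shell
#!/bin/sh
# Sketch only: function names and defaults are illustrative.

# Succeed if any regular file under $1 was modified in the last $2 minutes.
recently_modified() {
	# Only the first match matters, so keep a single line of output.
	test -n "$(find "$1" -type f -mmin "-$2" 2>/dev/null | head -n 1)"
}

# Every $2 seconds (default 300), pull in $1 if it saw activity in the
# last 1440 minutes (one day); otherwise just report that it is too old.
poll_loop() {
	repo=$1
	interval=${2:-300}
	while true
	do
		if recently_modified "$repo" 1440
		then
			git -C "$repo" pull
		else
			echo "too old"
		fi
		date
		sleep "$interval"
	done
}
```

Calling `poll_loop ~/g/git` would then behave roughly like the one-liner,
except that it pulls in the named repository instead of the current directory.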

> But perhaps a cheaper option might be the traditional email/IRC
> notification and having a client-side process watch for that
> before fetching.

If there were an IRC channel with this info I could/would use that.
Getting it via e-mail would just land me in the same problem
public-inbox is currently solving for me, i.e. I might as well keep the
git ML up-to-date on that machine if I'm otherwise going to need to
subscribe to a "hey there's a new message on the git ML" list :)
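
As an aside, the "PUSH\n" output proposed upthread for `git watch` can be
approximated on the client today by polling `git ls-remote` and comparing its
output between runs. This is a minimal sketch under that assumption: `git
watch` does not exist, the function names are hypothetical, and this is still
polling rather than a persistent connection.

```shell
#!/bin/sh
# Sketch only: approximates the proposed "git watch" output by polling.
# Function names are hypothetical.

# Print "PUSH" when the previous ($1) and current ($2) ref listings differ.
emit_if_changed() {
	if test "$1" != "$2"
	then
		echo PUSH
	fi
}

# Poll the remote in $1 every $2 seconds (default 60) and print "PUSH"
# whenever its advertised refs change.
watch_remote() {
	url=$1
	interval=${2:-60}
	old=$(git ls-remote "$url") || return 1
	while sleep "$interval"
	do
		# Skip this round if the remote is temporarily unreachable.
		new=$(git ls-remote "$url") || continue
		emit_if_changed "$old" "$new"
		old=$new
	done
}
```

Something like `watch_remote https://github.com/git/git.git | while read event; do git fetch; done`
would then run a fetch only after the refs have changed.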


Thread overview: 9+ messages
2017-06-21 14:27 Monitoring a repository for changes Tim Hutt
2017-06-21 15:04 ` Ævar Arnfjörð Bjarmason
2017-06-21 19:44   ` Jeff King
2017-06-21 19:55     ` Stefan Beller
2017-06-21 19:52   ` Eric Wong
2017-06-21 21:56     ` Ævar Arnfjörð Bjarmason [this message]
2017-06-21 22:20       ` Eric Wong
2017-06-21 22:36         ` Eric Wong
2017-06-21 21:19 ` Jonathan Nieder
