git.vger.kernel.org archive mirror
* A Python script to put CTAN into git (from DVDs)
@ 2011-11-06 15:17 Jonathan Fine
From: Jonathan Fine @ 2011-11-06 15:17 UTC
  To: python-list; +Cc: git

Hi

This is to let you know that I'm writing (in Python) a script that 
places the content of CTAN into a git repository.
     https://bitbucket.org/jfine/python-ctantools

I'm working from the TeX Collection DVDs that the TeX user groups 
publish each year.  Each DVD contains a snapshot of CTAN (about 100,000 
files occupying 4 GB), so I have to unzip folders and do a few other 
things.
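
The unzipping step could be sketched roughly as follows.  This is not 
the code from python-ctantools; the function name unpack_zips and the 
choice to extract each archive next to itself are assumptions made 
only for illustration.

```python
import zipfile
from pathlib import Path


def unpack_zips(root):
    """Extract every .zip file found under root.

    Each archive foo/bar.zip is unpacked into a sibling directory
    foo/bar.  Returns the list of directories that were created.
    (Hypothetical helper, not part of python-ctantools.)
    """
    extracted = []
    # sorted() collects the archive list up front, so directories
    # created during extraction do not disturb the traversal.
    for archive in sorted(Path(root).rglob("*.zip")):
        target = archive.with_suffix("")  # foo/bar.zip -> foo/bar
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted
```

Calling unpack_zips on a copy of a DVD snapshot would leave the 
original archives in place, so the result can be checked before 
anything is committed.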

CTAN is the Comprehensive TeX Archive Network.  CTAN keeps only the 
latest version of each file, but old CTAN snapshots will provide many 
earlier versions.

I'm working on putting old CTAN files into modern version control. 
Martin Scharrer is working in the other direction.  He's putting new 
files added to CTAN into Mercurial.
     http://ctanhg.scharrer-online.de/

My script already works as a proof of concept, but needs more work (and 
documentation) before it becomes useful.  I've requested that follow-ups 
go to comp.text.tex.

Longer-term goals are to use git as
* content-addressable storage
  (http://en.wikipedia.org/wiki/Content-addressable_storage)
* a resource editing and linking system
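
To illustrate the content-addressable side: git names a file's content 
(a "blob") by hashing a short header followed by the bytes themselves, 
so identical content always gets the same name.  A minimal sketch, 
assuming git's default SHA-1 object format (the function name 
git_blob_id is made up for this example):

```python
import hashlib


def git_blob_id(data):
    """Return the id git assigns to a blob holding these bytes.

    Assumes the default SHA-1 object format.
    """
    header = b"blob %d\0" % len(data)  # object type, size, NUL byte
    return hashlib.sha1(header + data).hexdigest()


# The empty blob has a well-known id.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The same value should come back from "git hash-object" run on a file 
with the same content.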

If you didn't know, a git tree is much like an immutable JSON object, 
except that it does not have arrays or numbers.
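
The analogy can be made concrete with a small sketch that computes the 
id git would give to a nested Python dict whose leaves are strings. 
The names tree_id and _object_id are invented for this example, and 
several simplifications are assumed: SHA-1 object format, every leaf 
stored as an ordinary file (mode 100644), UTF-8 text, and no symlinks 
or executables.

```python
import hashlib


def _object_id(kind, body):
    """Hash a git object: '<kind> <size>' + NUL + body, SHA-1 assumed."""
    header = kind + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).digest()


def tree_id(obj):
    """Return the raw 20-byte id of the tree for a dict of str or dict."""
    entries = []
    for name, value in obj.items():
        raw = name.encode("utf-8")
        if isinstance(value, dict):
            # Subtree: mode 40000; sorts as if its name ended in "/".
            mode, oid, key = b"40000", tree_id(value), raw + b"/"
        else:
            # Leaf string: stored as a regular-file blob.
            oid = _object_id(b"blob", value.encode("utf-8"))
            mode, key = b"100644", raw
        entries.append((key, mode + b" " + raw + b"\0" + oid))
    body = b"".join(entry for _, entry in sorted(entries))
    return _object_id(b"tree", body)


# The empty tree has a well-known id.
print(tree_id({}).hex())  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

Because entries are sorted before hashing, two dicts with the same 
keys and values get the same id whatever their insertion order, and 
changing any leaf changes the id of every tree above it.  That is the 
sense in which a tree behaves like an immutable object.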

If my project interests you, reply to this message or contact me 
directly (or both).

-- 
Jonathan

