* How do I get the contents of a directory in fast-import
From: Stefan Monnier @ 2016-01-01 15:54 UTC
To: git

I have a program which tries to collect info from lots of branches and
generate some table from that data into another branch.

For performance reasons, I'd like to do that from fast-import, and as
long as I know the name of all the files I need to consult, everything
is fine since I can use the "ls" and "cat-blob" commands of fast-import
to efficiently get the data I need.

But I also need to look at some files whose names I don't know beforehand
(i.e. all the files in some directories).  If I do "cat-blob" on those
directories I get some binary "thing" which I don't understand.

So how do I get a directory listing from fast-import, i.e.
like I can get with "git cat-file -p", but without having to fork
a separate git process?

        Stefan

* Re: How do I get the contents of a directory in fast-import
From: Stefan Monnier @ 2016-01-09 23:56 UTC
To: git

Any help would be greatly welcome, including "sorry, can't do that".

        Stefan

>>>>> "Stefan" == Stefan Monnier <monnier@iro.umontreal.ca> writes:

> I have a program which tries to collect info from lots of branches and
> generate some table from that data into another branch.

> For performance reasons, I'd like to do that from fast-import, and as
> long as I know the name of all the files I need to consult, everything
> is fine since I can use the "ls" and "cat-blob" commands of fast-import
> to efficiently get the data I need.

> But I also need to look at some files whose names I don't know beforehand
> (i.e. all the files in some directories).  If I do "cat-blob" on those
> directories I get some binary "thing" which I don't understand.

> So how do I get a directory listing from fast-import, i.e.
> like I can get with "git cat-file -p", but without having to fork
> a separate git process?

>         Stefan

* Re: How do I get the contents of a directory in fast-import
From: Jeff King @ 2016-01-15 22:39 UTC
To: Stefan Monnier
Cc: git

On Fri, Jan 01, 2016 at 10:54:00AM -0500, Stefan Monnier wrote:

> I have a program which tries to collect info from lots of branches and
> generate some table from that data into another branch.
>
> For performance reasons, I'd like to do that from fast-import, and as
> long as I know the name of all the files I need to consult, everything
> is fine since I can use the "ls" and "cat-blob" commands of fast-import
> to efficiently get the data I need.
>
> But I also need to look at some files whose names I don't know beforehand
> (i.e. all the files in some directories).  If I do "cat-blob" on those
> directories I get some binary "thing" which I don't understand.
>
> So how do I get a directory listing from fast-import, i.e.
> like I can get with "git cat-file -p", but without having to fork
> a separate git process?

I'm not sure I understand your use case exactly, but is the directory
listing you want part of the newly-added objects from fast-import, or
does it already exist in the branches you are collecting from?

If the latter, I wonder if a separate "cat-file --batch" process could
give you what you need (it's a separate process, but you can start a
single process and make many queries of it; I assume your desire not to
add an extra process is to avoid the overhead).

But I think it won't pretty-print trees for you; it will give you the
raw tree data (which I imagine is what you are getting from cat-blob,
too).

I'm not sure the raw tree format is actually documented anywhere (it was
part of the original revisions of git, and hasn't changed since).  But it
is basically:

  tree       = tree_entry*
  tree_entry = mode SP path NUL sha1
  mode       = ascii mode, in octal (e.g., "100644")
  path       = <any byte except NUL>*
  sha1       = <any byte>{20}
  SP         = ascii space (0x20)
  NUL        = 0-byte

So it is pretty simple to parse.

There may be a better way to do what you want with fast-import.  I'm not
familiar enough with it to say.

-Peff
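
The tree grammar given in the message above can be turned into a parser fairly directly. What follows is only a minimal sketch in Python, not anything taken from git itself; the names `TreeEntry` and `parse_tree` are made up for illustration, and the input is assumed to be the raw (uncompressed) body of a tree object, with 20-byte SHA-1 object names as in the grammar.

```python
from typing import Iterator, NamedTuple


class TreeEntry(NamedTuple):
    """One entry of a raw tree object (hypothetical helper type)."""
    mode: str    # ASCII octal mode as stored, e.g. "100644" or "40000"
    path: bytes  # entry name; any bytes except NUL, so kept as bytes
    sha1: str    # the 20 raw bytes, rendered as 40 hex digits


def parse_tree(data: bytes) -> Iterator[TreeEntry]:
    """Yield the entries of a raw tree: (mode SP path NUL sha1)*."""
    pos = 0
    while pos < len(data):
        # The mode never contains a space, so the first SP ends it.
        sp = data.index(b" ", pos)
        # The path may contain spaces but never NUL.
        nul = data.index(b"\x00", sp + 1)
        raw_sha1 = data[nul + 1:nul + 21]
        if len(raw_sha1) != 20:
            raise ValueError("truncated tree entry")
        yield TreeEntry(
            mode=data[pos:sp].decode("ascii"),
            path=data[sp + 1:nul],
            sha1=raw_sha1.hex(),
        )
        # The sha1 is binary and may itself contain SP or NUL bytes,
        # so skip it by length rather than by searching.
        pos = nul + 21
```

Each entry is consumed by length once its NUL has been found, which is why the binary sha1 cannot confuse the scan. The mode is kept as a string because it is stored without leading zeros (a subdirectory appears as "40000"), so a caller can compare it textually or convert it with `int(mode, 8)`.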

* Re: How do I get the contents of a directory in fast-import
From: Stefan Monnier @ 2016-01-16 1:59 UTC
To: git

>> So how do I get a directory listing from fast-import, i.e.
>> like I can get with "git cat-file -p", but without having to fork
>> a separate git process?
> I'm not sure I understand your use case exactly, but is the directory
> listing you want part of the newly-added objects from fast-import, or
> does it already exist in the branches you are collecting from?

For the most important cases, the relevant revision already exists
before fast-import, yes.

> If the latter, I wonder if a separate "cat-file --batch" process could
> give you what you need (it's a separate process, but you can start a

I'm not sure exactly how "git cat-file --batch" works internally
(whether it tries to keep active revisions, like fast-import does), but
I've indeed used it successfully (though for files).

> single process and make many queries of it; I assume your desire not to
> add an extra process is to avoid the overhead).

The overhead of starting a new process is one part, but another is the
overhead of re-reading the refs (I can have tens of thousands of
branches in my repository), etc.

> But I think it won't pretty-print trees for you; it will give you the
> raw tree data

Indeed.

> (which I imagine is what you are getting from cat-blob, too).

Actually no, "cat-blob" gives an error instead:

    fatal: Object 2ca1672d50c9dbfe582dc53af3c7ce9891a7a664 is a tree but a blob was expected.

> I'm not sure that's actually documented anywhere (it was part of
> the original revisions of git, and hasn't changed since).  But it is
> basically:

>   tree = tree_entry*
>   tree_entry = mode SP path NUL sha1
>   mode = ascii mode, in octal (e.g., "100644")
>   path = <any byte except NUL>*
>   sha1 = <any byte>{20}
>   SP = ascii space (0x20)
>   NUL = 0-byte

Ah, thanks.  It'd be great if cat-blob could return this instead of
signalling an error.

> So it is pretty simple to parse.

My program is written in /bin/sh so parsing the above is actually rather
inconvenient, but it's much better than just getting an error.

        Stefan
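
The single long-running "cat-file --batch" process suggested earlier in the thread could be driven roughly as follows. This is a sketch under stated assumptions rather than a tested tool: `BatchObject`, `read_batch_response` and `query` are hypothetical helper names, and the reader relies on the documented reply format of `git cat-file --batch`, namely a header line `<sha1> SP <type> SP <size> LF`, then the contents, then one LF (or `<object> SP missing LF` when the object does not exist).

```python
from typing import BinaryIO, NamedTuple, Optional


class BatchObject(NamedTuple):
    """One reply from `git cat-file --batch` (hypothetical helper type)."""
    name: str    # object name echoed back by git
    type: str    # "blob", "tree", "commit" or "tag"
    data: bytes  # raw object contents, exactly <size> bytes


def read_batch_response(stream: BinaryIO) -> Optional[BatchObject]:
    """Read one reply from the batch output; return None if missing."""
    header = stream.readline()
    if not header:
        raise EOFError("batch process closed its output")
    line = header.rstrip(b"\n")
    # Unknown objects are reported as "<object> SP missing".
    if line.endswith(b" missing"):
        return None
    fields = line.decode("ascii").split(" ")
    if len(fields) != 3:
        raise ValueError(f"unexpected header: {line!r}")
    name, obj_type, size = fields[0], fields[1], int(fields[2])
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("short read of object contents")
    # The contents are followed by a single LF that is not part of them.
    if stream.read(1) != b"\n":
        raise ValueError("missing LF after object contents")
    return BatchObject(name, obj_type, data)


def query(proc, spec: str) -> Optional[BatchObject]:
    """Send one object spec (e.g. "HEAD:some/dir") and read the reply.

    `proc` is assumed to be a subprocess.Popen of
    ["git", "cat-file", "--batch"] with stdin and stdout as pipes.
    """
    proc.stdin.write(spec.encode() + b"\n")
    proc.stdin.flush()
    return read_batch_response(proc.stdout)
```

A caller would start the process once with `subprocess.Popen(["git", "cat-file", "--batch"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)` and then call `query` repeatedly. When the returned `type` is "tree", `data` holds the raw tree bytes in the format described earlier in the thread.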