From: Brandon Williams <bmwill@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Duy Nguyen <pclouds@gmail.com>,
Stefan Beller <sbeller@google.com>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH 0/5] Start of a journey: drop NO_THE_INDEX_COMPATIBILITY_MACROS
Date: Wed, 3 May 2017 20:24:15 -0700 [thread overview]
Message-ID: <20170504032415.GA139188@google.com> (raw)
In-Reply-To: <xmqqa86tbamh.fsf@gitster.mtv.corp.google.com>
On 05/03, Junio C Hamano wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
> >>> In the long run we may want to drop the macros guarded by
> >>> NO_THE_INDEX_COMPATIBILITY_MACROS. This converts a couple of them.
> >>
> >> Why?
> >
> > Well.. why did you add NO_THE_INDEX_COMP... in the first place? ;-) j/k
>
> That was to mark the code that implements the underlying machinery
> that the code in a particular file has already been vetted and
> converted to be capable of working with an arbitrary index_state
> instance, not just the_index instance.
>
> Your mentioning "at least outside builtin/" shows that you have a
> good understanding of the issue; they are infrastructure code and
> many of them may become more useful if they are enhanced to take an
> index_state instance instead of always working with the_index. With
> an infrastructure code that insists on working on the_index, an
> application that has something valuable in the_index already needs
> to somehow save it elsewhere before calling the function and use the
> result from the_index before restoring its version.
>
> I think unpack-trees.c used to assume that it is OK to use the_index,
> but after it started taking src/dst_index, it gained NO_THE_INDEX_COMP
> thing. That's the kind of change in this area we would want to see.
>
> > I attempted the same thing once or twice in the past, had the same
> > impression and dropped it. But I think it's good to get rid of cache_*
> > macros, at least outside builtin/ directory.
>
> One thing that needs to be kept in mind is that a mechanical
> replacement of active_cache[] to the_index.cache[] in the
> infrastructure code does not get us closer to such an improvement.
> In fact, such a patch only has negative value (keep reading until
> the end of the paragraph that begins with "One thing is certain" to
> understand why).
>
> The entry point to a codechain for an important infrastructure code
> that can currently only work with the_index needs to be identified.
> There may be many. They need to be made to take an index_state
> instance and to pass it through the callchain. That kind of change
> would be an improvement, and will get us closer to mark the
> resulting code with NO_THE_INDEX_COMP thing.
>
> I think somebody else said this, but I agree that in the longer
> term, we would want to have a "repository" abstraction for in-core
> data structure.
>
> Imagine that a program wants to read into an index_state the tip
> commit of a branch whose name is the same as the current branch in
> one submodule and do something with the objects, while keeping
> the_index for the top-level superproject. Thanks to Michael's and
> your recent work, we are almost there to have "ref-store" that
> represents the set of refs the submodule in question has that a
> "repository" object that represents the submodule can point at. We
> can learn the object name at the tip of a specific ref using that
> infrastructure. Thanks to an ancient change under discussion, we
> can already read the contents of the index file into an instance of
> an index_state that is different from the_index. But after doing
> these operations, we'd need to actually get to the contents of the
> objects the index_state holds. How should that work in the
> "repository object represents an in-core repository" world?
>
> What is missing is a way to represent the object store, so that the
> "repository" object can point at an instance of it (which in turn
> may point at a set of its alternates) [*1*]. With that, perhaps the
> most general kind of the infrastructure API that works with an
> index_state instance may additionally take a repository object, or
> perhaps an index_state may have a pointer to the repository where
> the objects listed in it must be taken from (in which case the
> function may take only an index_state). In the end the enhanced
> function has to allow us to read from the object store of the
> submodule, not from the superproject's.
>
> Depending on how the design of the repository object goes, the
> interface to these functions may have to change.
>
> But one thing is certain. Many users of the API, just like a lot of
> builtin/ code works only with the_index today, will not have to work
> with a non-default repository or a non-default index_state. Only
> the more advanced and complex "multi-repo" Porcelains would have to
> care. Keeping active_cache[], active_nr, etc. abstraction would
> isolate the simpler callers from the inevitable code churn we would
> need while getting the "repository" abstraction right. This is why
> I see that you understood the issues well when you said "builtiln/".
>
> If we want to have the "repository" abstraction, the "object store"
> is the next thing we should be working on, and such a work can take
> inspiration from how the old <active_cache[], active_nr,
> active_alloc, active_cache_changed and *active_cache_tree> tuple was
> turned into an "index" object while allowing the majority of code
> that want to deal with the singleton instance keep using the old
> names via CPP macros.
>
> Also, dropping compatibility macros at the end of the series is
> unfriendly to fellow developers with WIPs that depend on the
> traditional way of doing things. It is doubly insulting to them to
> send such a series during the feature freeze period, when it is more
> likely that they are holding off their WIP in an attempt to allow us
> to concentrate more on what we should be, i.e. finding and fixing
> possible regressions at the tip of 'master' to polish the upcoming
> release and help the end users.
>
>
> [Footnote]
>
> *1* Instead we have an ugly hack that lets the singleton object
> store add objects from the submodule object store, and it does not
> allow us to back out once we did so (unlike the "ref-store" thing
> that we could). When we need to work with a project with many
> submodules, we should be able to say "OK, we are done with this
> thing for now, call repository_clear() to reclaim the resources we
> used to deal with it".
>
Thanks for the detailed explanation on what a possible transition to
having a repository object would look like. While digging through the
code it does seem like the object store is one of the main things
holding the code base back from having a repository object.
Thank being said, I spent today hacking away at 'ls-files' in order to
see if it would be feasible to convert it to using a repository object
for the recursive submodule case instead of using run-command. I picked
'ls-files' mainly because the majority of the code paths don't touch the
object store and mainly deal with manipulating the index. Despite the
hack-y, cobbled together state that it is in currently it seems very
feasible goal.
--
Brandon Williams
next prev parent reply other threads:[~2017-05-04 3:24 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-05-01 19:07 [PATCH 0/5] Start of a journey: drop NO_THE_INDEX_COMPATIBILITY_MACROS Stefan Beller
2017-05-01 19:07 ` [PATCH 1/5] cache.h: drop read_cache() Stefan Beller
2017-05-01 19:07 ` [PATCH 2/5] cache.h: drop active_* macros Stefan Beller
2017-05-01 19:07 ` [PATCH 3/5] cache.h: drop read_cache_from Stefan Beller
2017-05-01 19:07 ` [PATCH 4/5] cache.h: drop read_cache_preload(pathspec) Stefan Beller
2017-05-01 19:07 ` [PATCH 5/5] cache.h: drop read_cache_unmerged() Stefan Beller
2017-05-02 1:36 ` [PATCH 0/5] Start of a journey: drop NO_THE_INDEX_COMPATIBILITY_MACROS Junio C Hamano
2017-05-02 4:17 ` Stefan Beller
2017-05-02 14:05 ` Jeff Hostetler
2017-05-03 11:31 ` Samuel Lijin
2017-05-03 17:14 ` Stefan Beller
2017-05-03 18:22 ` Samuel Lijin
2017-05-04 3:29 ` Brandon Williams
2017-05-03 10:27 ` Duy Nguyen
2017-05-03 17:02 ` Stefan Beller
2017-05-04 2:48 ` Junio C Hamano
2017-05-04 3:24 ` Brandon Williams [this message]
2017-05-04 18:30 ` Stefan Beller
2017-05-05 14:31 ` Johannes Schindelin
2017-05-05 17:20 ` Brandon Williams
2017-05-04 19:19 ` Jonathan Nieder
2017-05-05 17:22 ` Junio C Hamano
2017-05-05 17:29 ` Brandon Williams
2017-05-02 15:35 ` Jeff Hostetler
2017-05-02 17:06 ` Stefan Beller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170504032415.GA139188@google.com \
--to=bmwill@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=pclouds@gmail.com \
--cc=sbeller@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).