* Unused #include statements @ 2015-01-15 3:43 Zoltan Klinger 2015-01-15 4:14 ` Robert Schiele 0 siblings, 1 reply; 8+ messages in thread From: Zoltan Klinger @ 2015-01-15 3:43 UTC (permalink / raw) To: GIT Mailing-list Hello there, Since reading a post [1] about removing some unnecessary #include statements from a git C source file I've been intrigued to see how many more might be lurking in the code base. After a bit of digging around, my brute force approach of 'remove as many #includes as possible while making sure the code still successfully compiles' has returned the following results: $ git diff --stat alloc.c | 2 -- archive-zip.c | 1 - archive.c | 1 - argv-array.c | 1 - bisect.c | 9 --------- block-sha1/sha1.c | 2 -- branch.c | 1 - builtin/add.c | 7 ------- builtin/annotate.c | 1 - builtin/apply.c | 8 -------- builtin/archive.c | 3 --- builtin/bisect--helper.c | 2 -- builtin/blame.c | 13 ------------- builtin/branch.c | 8 -------- builtin/bundle.c | 1 - builtin/cat-file.c | 3 --- builtin/check-attr.c | 2 -- builtin/check-ignore.c | 2 -- builtin/check-mailmap.c | 2 -- builtin/check-ref-format.c | 2 -- builtin/checkout-index.c | 2 -- builtin/checkout.c | 11 ----------- builtin/clean.c | 3 --- builtin/clone.c | 10 ---------- builtin/column.c | 3 --- builtin/commit-tree.c | 4 ---- builtin/commit.c | 13 ------------- builtin/config.c | 1 - builtin/count-objects.c | 2 -- builtin/credential.c | 1 - builtin/describe.c | 5 ----- builtin/diff-files.c | 4 ---- builtin/diff-index.c | 4 ---- builtin/diff-tree.c | 4 ---- builtin/diff.c | 6 ------ builtin/fast-export.c | 8 -------- builtin/fetch.c | 8 -------- builtin/fmt-merge-msg.c | 6 ------ builtin/for-each-ref.c | 6 ------ builtin/fsck.c | 7 ------- builtin/gc.c | 3 --- builtin/get-tar-commit-id.c | 3 --- builtin/grep.c | 6 ------ builtin/hash-object.c | 2 -- builtin/help.c | 2 -- builtin/index-pack.c | 5 ----- builtin/init-db.c | 1 - builtin/interpret-trailers.c | 3 --- builtin/log.c | 14 -------------- builtin/ls-files.c | 3 --- builtin/ls-remote.c | 3 --- builtin/ls-tree.c | 3 --- builtin/mailinfo.c | 2 -- builtin/mailsplit.c | 3 --- builtin/merge-base.c | 5 ----- builtin/merge-file.c | 2 -- builtin/merge-index.c | 1 - builtin/merge-ours.c | 1 - builtin/merge-recursive.c | 3 --- builtin/merge-tree.c | 2 -- builtin/merge.c | 10 ---------- builtin/mktree.c | 2 -- builtin/mv.c | 4 ---- builtin/name-rev.c | 3 --- builtin/notes.c | 4 ---- builtin/pack-objects.c | 14 -------------- builtin/prune-packed.c | 1 - builtin/prune.c | 5 ----- builtin/push.c | 6 ------ builtin/read-tree.c | 4 ---- builtin/receive-pack.c | 12 ------------ builtin/reflog.c | 5 ----- builtin/remote-ext.c | 1 - builtin/remote-fd.c | 1 - builtin/remote.c | 6 ------ builtin/repack.c | 6 ------ builtin/replace.c | 1 - builtin/rerere.c | 4 ---- builtin/reset.c | 7 ------- builtin/rev-list.c | 7 ------- builtin/rev-parse.c | 5 ----- builtin/revert.c | 4 ---- builtin/rm.c | 4 ---- builtin/send-pack.c | 7 ------- builtin/shortlog.c | 7 ------- builtin/show-branch.c | 3 --- builtin/show-ref.c | 5 ----- builtin/stripspace.c | 1 - builtin/symbolic-ref.c | 1 - builtin/tag.c | 4 ---- builtin/unpack-objects.c | 7 ------- builtin/update-index.c | 7 ------- builtin/update-ref.c | 3 --- builtin/update-server-info.c | 1 - builtin/upload-archive.c | 5 ----- builtin/verify-commit.c | 5 ----- builtin/verify-pack.c | 1 - builtin/verify-tag.c | 5 ----- builtin/write-tree.c | 2 -- bulk-checkin.c | 3 --- bundle.c | 5 ----- cache-tree.c | 2 -- check-racy.c | 1 - column.c | 2 -- combine-diff.c | 6 ------ commit.c | 7 ------- compat/basename.c | 1 - compat/fopen.c | 1 - compat/gmtime.c | 1 - compat/hstrerror.c | 3 --- compat/inet_ntop.c | 1 - compat/inet_pton.c | 1 - compat/mingw.c | 7 ------- compat/mkdir.c | 1 - compat/mkdtemp.c | 1 - compat/mmap.c | 1 - compat/msvc.c | 5 ----- compat/nedmalloc/nedmalloc.c | 2 -- compat/obstack.c | 1 - compat/poll/poll.c | 6 ------ compat/pread.c | 1 - compat/precompose_utf8.c | 1 - compat/qsort.c | 1 - compat/regex/regex.c | 2 -- compat/setenv.c | 1 - compat/snprintf.c | 1 - compat/strcasestr.c | 1 - compat/strlcpy.c | 1 - compat/strtoimax.c | 1 - compat/strtoumax.c | 1 - compat/terminal.c | 2 -- compat/unsetenv.c | 1 - compat/win32/dirent.c | 1 - compat/win32/pthread.c | 4 ---- compat/win32/syslog.c | 1 - compat/win32mmap.c | 1 - compat/winansi.c | 3 --- config.c | 4 ---- connect.c | 4 ---- connected.c | 2 -- contrib/convert-objects/convert-objects.c | 4 ---- .../gnome-keyring/git-credential-gnome-keyring.c | 6 ------ .../credential/osxkeychain/git-credential-osxkeychain.c | 4 ---- contrib/credential/wincred/git-credential-wincred.c | 4 ---- contrib/examples/builtin-fetch--tool.c | 5 ----- contrib/svn-fe/svn-fe.c | 2 -- convert.c | 2 -- credential-cache--daemon.c | 2 -- credential-cache.c | 3 --- credential-store.c | 1 - credential.c | 1 - csum-file.c | 1 - daemon.c | 3 --- diff-delta.c | 1 - diff-lib.c | 5 ----- 155 files changed, 562 deletions(-) So my questions are as follows: (1) Is it worth turning this into a proper patch? (2) Given the large number of files (150+) what would be the best approach in preparing the patch? (a) One commit containing all the changes? Would it be a bit too much to digest in one go? (b) One commit per file changed? Feels a bit over the top also it would flood the mailing list. (c) One commit each for changes in the root directory, builtin, compat and contrib directories? Thanks, Zoltan [1] http://article.gmane.org/gmane.comp.version-control.git/262402 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Unused #include statements 2015-01-15 3:43 Unused #include statements Zoltan Klinger @ 2015-01-15 4:14 ` Robert Schiele 2015-01-15 6:33 ` Jeff King 0 siblings, 1 reply; 8+ messages in thread From: Robert Schiele @ 2015-01-15 4:14 UTC (permalink / raw) To: Zoltan Klinger; +Cc: GIT Mailing-list Hi Zoltan, I can't make a statement for the git project but I consider this kind of brute-force removal a very problematic approach for languages like C and C++. The reason for that is simple: Often header files include other header files since their content depends on those other header files. Let's assume a header file a.h includes b.h. Now consider a file c.c includes both a.h and b.h since they are actually using stuff from both of them. Your brute-force approach would now remove b.h since it is indirectly pulled in through a.h. While the removal would no lead to technially incorrect code it would no longer reflect the semantical situation (since c.c still uses stuff from b.h). Even worse is that if you later modify c.c to no longer use stuff from a.h you could no longer remove a.h since that way you would break the chain to b.h. Thus doing those kind of brute-force removals generally makes the include structure in a project very fragile. The analysis itself you did is still useful to identify header files that can potentially be removed but removing them without further analysis I would consider problematic. Robert ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Unused #include statements 2015-01-15 4:14 ` Robert Schiele @ 2015-01-15 6:33 ` Jeff King 2015-01-15 18:50 ` Junio C Hamano 0 siblings, 1 reply; 8+ messages in thread From: Jeff King @ 2015-01-15 6:33 UTC (permalink / raw) To: Robert Schiele; +Cc: Zoltan Klinger, GIT Mailing-list On Thu, Jan 15, 2015 at 05:14:39AM +0100, Robert Schiele wrote: > Thus doing those kind of brute-force removals generally makes the > include structure in a project very fragile. The analysis itself you > did is still useful to identify header files that can potentially be > removed but removing them without further analysis I would consider > problematic. I would second that. Besides leading to a potentially fragile result, this analysis was done only for a particular platform with a particular set of config knobs. One of our rules is that git-compat-util.h (or one of the well-known headers which includes, cache.h or builtin.h) is included first in any translation unit. This gives git-compat-util the cleanest environment possible for making decisions, and lets macros it defines effect the rest of the code consistently. I suspect on modern platforms like Linux/glibc that it is not a huge deal to include git-compat-util a little late, simply because it does not have all that much to do. But on Solaris 8? Who knows. -Peff ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Unused #include statements 2015-01-15 6:33 ` Jeff King @ 2015-01-15 18:50 ` Junio C Hamano 2015-01-15 22:38 ` Jeff King 2015-01-20 2:08 ` Zoltan Klinger 0 siblings, 2 replies; 8+ messages in thread From: Junio C Hamano @ 2015-01-15 18:50 UTC (permalink / raw) To: Jeff King; +Cc: Robert Schiele, Zoltan Klinger, GIT Mailing-list Jeff King <peff@peff.net> writes: > One of our rules is that git-compat-util.h (or one of the well-known > headers which includes, cache.h or builtin.h) is included first in any > translation unit. Perhaps by now we can spell it out a bit more explicitly and update CodingGuidelines. We have: - The first #include in C files, except in platform specific compat/ implementations, should be git-compat-util.h or another header file that includes it, such as cache.h or builtin.h. "such as" might be making things unnecessarily vague; I do not think a valid reason why we should say a .c file that includes "advice.h" as the first thing satisfies this requirement, for example. Because: - A command that interacts with the object store, config subsystem, the index, or the working tree cannot do anything without using what is declared in "cache.h". - A built-in command must be declared in "builtin.h", so anything in builtin/*.c must include it. it may be reasonable to say the first *.h file included must be one of git-compat-util.h, cache.h or builtin.h (and then we make sure that compat-util is the first thing included in either of the latter two). While I very much agree with the principle Robert alluded to [*1*], we may want to loosen that for .c files that include "builtin.h" or "cache.h" for the sake of brevity. For example, if you are builtin (hence you start by #include "builtin.h"), it may be reasonable to allow you to take whatever is in "cache.h" for granted [*2*]. So the rule might be: - The first #include in C files, except in platform specific compat/ implementations, must be either git-compat-util.h, cache.h or builtin.h. - A C file must directly include the header files that declare the functions and the types it uses, except for the functions and types that are made available to it by including one of the header files it must include by the previous rule. Optionally, - A C file must include only one of "git-compat-util.h", "cache.h" or "builtin.h"; e.g. if you include "builtin.h", do not include the other two, but it can consider what is availble in "cache.h" available to it. Thoughts? I am not looking forward to a torrent of patches whose sole purpose is to make the existing C files conform to any such rule, though. Clean-up patches that trickle in at a low rate is tolerable, but a torrent is too distracting. [Footnote] *1* For example, even though "diff.h" may include "tree-walk.h" for its own use, a .c file that includes "diff.h" without including "tree-walk.h" that uses update_tree_entry() or anything that is declared in the latter is very iffy from semantic point of view. *2* Because that facility is so widely used inside the codebase, "builtin.h" includes "strbuf.h", so in addition to what are in "cache.h", we may want to allow builtin implementations to take strbufs for granted as well. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Unused #include statements 2015-01-15 18:50 ` Junio C Hamano @ 2015-01-15 22:38 ` Jeff King 2015-01-15 23:20 ` Junio C Hamano 2015-01-20 2:08 ` Zoltan Klinger 1 sibling, 1 reply; 8+ messages in thread From: Jeff King @ 2015-01-15 22:38 UTC (permalink / raw) To: Junio C Hamano; +Cc: Robert Schiele, Zoltan Klinger, GIT Mailing-list On Thu, Jan 15, 2015 at 10:50:45AM -0800, Junio C Hamano wrote: > So the rule might be: > > - The first #include in C files, except in platform specific > compat/ implementations, must be either git-compat-util.h, > cache.h or builtin.h. > > - A C file must directly include the header files that declare the > functions and the types it uses, except for the functions and > types that are made available to it by including one of the > header files it must include by the previous rule. Yeah, that makes sense (and is what I took away from the existing rule in CodingGuidelines, but I agree what is there is not very rigorous). > Optionally, > > - A C file must include only one of "git-compat-util.h", "cache.h" > or "builtin.h"; e.g. if you include "builtin.h", do not include > the other two, but it can consider what is availble in "cache.h" > available to it. > > Thoughts? I am not looking forward to a torrent of patches whose > sole purpose is to make the existing C files conform to any such > rule, though. Clean-up patches that trickle in at a low rate is > tolerable, but a torrent is too distracting. I don't think the "optionally" one above is that necessary. Not because I don't agree with it, but because I do not know that we want to get into the business of laying out every minute detail and implication. The CodingGuidelines document is meant to be guidelines, and I do not want to see arguments like "well, the guidelines do not explicitly _disallow_ this, so you must accept it or add something to the guideline". That is a waste of everybody's time. A general philosophy + good taste (from the submitter and the maintainer) should ideally be enough. And hopefully would stop a torrent of "but this file doesn't conform to the letter of CodingGuidelines!". Maybe it does not, but if there is no tangible benefit besides blindly following some rules, it is not worth the precious time of developers. Which isn't to say we shouldn't clarify the document when need be. But I think what I quoted at the top already is probably a good improvement over what is there. -Peff ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Unused #include statements 2015-01-15 22:38 ` Jeff King @ 2015-01-15 23:20 ` Junio C Hamano 2015-01-16 0:00 ` Jeff King 0 siblings, 1 reply; 8+ messages in thread From: Junio C Hamano @ 2015-01-15 23:20 UTC (permalink / raw) To: Jeff King; +Cc: Robert Schiele, Zoltan Klinger, GIT Mailing-list Jeff King <peff@peff.net> writes: > On Thu, Jan 15, 2015 at 10:50:45AM -0800, Junio C Hamano wrote: >> ... >> Thoughts? I am not looking forward to a torrent of patches whose >> sole purpose is to make the existing C files conform to any such >> rule, though. Clean-up patches that trickle in at a low rate is >> tolerable, but a torrent is too distracting. > > I don't think the "optionally" one above is that necessary. Not because > I don't agree with it, but because I do not know that we want to get > into the business of laying out every minute detail and implication. > The CodingGuidelines document is meant to be guidelines, and I do not > want to see arguments like "well, the guidelines do not explicitly > _disallow_ this, so you must accept it or add something to the > guideline". That is a waste of everybody's time. Totally. I know we do not want to get into that business. > A general philosophy + good taste (from the submitter and the > maintainer) should ideally be enough. Yes, "ideally" ;-) > Which isn't to say we shouldn't clarify the document when need be. But I > think what I quoted at the top already is probably a good improvement > over what is there. OK, thanks. Let's queue something like this for post 2.3 cycle, then. -- >8 -- Subject: CodingGuidelines: clarify C #include rules Even though "advice.h" includes "git-compat-util.h", it is not sensible to have it as the first #include and indirectly satisify the "You must give git-compat-util.h a clean environment to set up feature test macros before including any of the system headers are included", which is the real requirement. Because: - A command that interacts with the object store, config subsystem, the index, or the working tree cannot do anything without using what is declared in "cache.h"; - A built-in command must be declared in "builtin.h", so anything in builtin/*.c must include it; - These two headers both include "git-compat-util.h" as the first thing; and - Almost all our *.c files (outside compat/ and borrowed files in xdiff/) need some Git-ness from "cache.h" to do something Git-ish. let's explicitly specify that one of these three header files must be the first thing that is included. Any of our *.c file should include the header file that directly declares what it uses, instead of relying on the fact that some *.h file it includes happens to include another *.h file that declares the necessary function or type. Spell it out as another guideline item. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> --- Documentation/CodingGuidelines | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines index 894546d..578d07c 100644 --- a/Documentation/CodingGuidelines +++ b/Documentation/CodingGuidelines @@ -328,9 +328,14 @@ For C programs: - When you come up with an API, document it. - - The first #include in C files, except in platform specific - compat/ implementations, should be git-compat-util.h or another - header file that includes it, such as cache.h or builtin.h. + - The first #include in C files, except in platform specific compat/ + implementations, must be either "git-compat-util.h", "cache.h" or + "builtin.h". You do not have to include more than one of these. + + - A C file must directly include the header files that declare the + functions and the types it uses, except for the functions and types + that are made available to it by including one of the header files + it must include by the previous rule. - If you are planning a new command, consider writing it in shell or perl first, so that changes in semantics can be easily ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: Unused #include statements 2015-01-15 23:20 ` Junio C Hamano @ 2015-01-16 0:00 ` Jeff King 0 siblings, 0 replies; 8+ messages in thread From: Jeff King @ 2015-01-16 0:00 UTC (permalink / raw) To: Junio C Hamano; +Cc: Robert Schiele, Zoltan Klinger, GIT Mailing-list On Thu, Jan 15, 2015 at 03:20:09PM -0800, Junio C Hamano wrote: > OK, thanks. Let's queue something like this for post 2.3 cycle, > then. > > -- >8 -- > Subject: CodingGuidelines: clarify C #include rules > [...] Thanks, this looks good to me. -Peff ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Unused #include statements 2015-01-15 18:50 ` Junio C Hamano 2015-01-15 22:38 ` Jeff King @ 2015-01-20 2:08 ` Zoltan Klinger 1 sibling, 0 replies; 8+ messages in thread From: Zoltan Klinger @ 2015-01-20 2:08 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Robert Schiele, GIT Mailing-list Robert, Peff and Junio. Thank you all for your feedback. It's clear now what sort of analysis I should aim towards. Thanks, Zoltan ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2015-01-20 2:08 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-01-15 3:43 Unused #include statements Zoltan Klinger 2015-01-15 4:14 ` Robert Schiele 2015-01-15 6:33 ` Jeff King 2015-01-15 18:50 ` Junio C Hamano 2015-01-15 22:38 ` Jeff King 2015-01-15 23:20 ` Junio C Hamano 2015-01-16 0:00 ` Jeff King 2015-01-20 2:08 ` Zoltan Klinger
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).