git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Documentation: some coding guideline updates
@ 2024-07-24 11:05 Patrick Steinhardt
  2024-07-24 11:05 ` [PATCH 1/3] Documentation: clarify indentation style for C preprocessor directives Patrick Steinhardt
                   ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-24 11:05 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

[-- Attachment #1: Type: text/plain, Size: 586 bytes --]

Hi,

I've had it in my mind for a long time to update our coding guideline
for some things that I frequently stumble over during code reviews. This
small patch series fills in those gaps.

Thanks!

Patrick

Patrick Steinhardt (3):
  Documentation: clarify indentation style for C preprocessor directives
  Documentation: document naming schema for struct-related functions
  Documentation: document difference between release and free

 Documentation/CodingGuidelines | 41 ++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

-- 
2.46.0.rc1.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/3] Documentation: clarify indentation style for C preprocessor directives
  2024-07-24 11:05 [PATCH 0/3] Documentation: some coding guideline updates Patrick Steinhardt
@ 2024-07-24 11:05 ` Patrick Steinhardt
  2024-07-24 16:41   ` Junio C Hamano
  2024-07-24 11:05 ` [PATCH 2/3] Documentation: document naming schema for struct-related functions Patrick Steinhardt
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-24 11:05 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

[-- Attachment #1: Type: text/plain, Size: 1156 bytes --]

There has recently been some discussion around how C preprocessor
directives shall be indented [1]. This discussion was settled towards
indenting after the hash by two spaces [2]. Document it as such.

[1]: https://lore.kernel.org/all/xmqqwmmm1bw6.fsf@gitster.g/
[2]: https://lore.kernel.org/all/20240708092317.267915-2-karthik.188@gmail.com/

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/CodingGuidelines | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 1d92b2da03..1d09384f28 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -214,6 +214,16 @@ For C programs:
  - We use tabs to indent, and interpret tabs as taking up to
    8 spaces.
 
+ - Nested C preprocessor directives are indented after the hash by two
+   spaces per nesting level.
+
+	#if FOO
+	#  include <foo.h>
+	#  if BAR
+	#    include <bar.h>
+	#  endif
+	#endif
+
  - We try to keep to at most 80 characters per line.
 
  - As a Git developer we assume you have a reasonably modern compiler
-- 
2.46.0.rc1.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/3] Documentation: document naming schema for struct-related functions
  2024-07-24 11:05 [PATCH 0/3] Documentation: some coding guideline updates Patrick Steinhardt
  2024-07-24 11:05 ` [PATCH 1/3] Documentation: clarify indentation style for C preprocessor directives Patrick Steinhardt
@ 2024-07-24 11:05 ` Patrick Steinhardt
  2024-07-24 11:42   ` Karthik Nayak
  2024-07-24 16:50   ` Junio C Hamano
  2024-07-24 11:05 ` [PATCH 3/3] Documentation: document difference between release and free Patrick Steinhardt
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-24 11:05 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

[-- Attachment #1: Type: text/plain, Size: 1757 bytes --]

We nowadays have a proper mishmash of struct-related functions that are
called `<verb>_<struct>` (e.g. `clear_prio_queue()`) versus functions
that are called `<struct>_<verb>` (e.g. `strbuf_clear()`). While the
former style may be easier to tie into a spoken conversation, most of
our communication happens in text anyway. Furthermore, prefixing
functions with the name of the structure they operate on makes it way
easier to group them together, see which functions are related, and will
also help folks who are using code completion.

Let's thus settle on one style, namely the one where functions start
with the name of the structure they operate on.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/CodingGuidelines | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 1d09384f28..34fcbcb5a4 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -541,6 +541,25 @@ For C programs:
    use your own debugger and arguments. Example: `GIT_DEBUGGER="ddd --gdb"
    ./bin-wrappers/git log` (See `wrap-for-bin.sh`.)
 
+ - Functions that operate on a specific structure and which are used by
+   other subsystems shall be named after the structure. The function
+   name should start with the name of the structure followed by a verb.
+   E.g.
+
+	struct strbuf;
+
+	void strbuf_add(struct strbuf *buf, ...);
+
+	void strbuf_reset(struct strbuf *buf);
+
+    is preferred over:
+
+	struct strbuf;
+
+	void add_string(struct strbuf *buf, ...);
+
+	void reset_strbuf(struct strbuf *buf);
+
 For Perl programs:
 
  - Most of the C guidelines above apply.
-- 
2.46.0.rc1.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 3/3] Documentation: document difference between release and free
  2024-07-24 11:05 [PATCH 0/3] Documentation: some coding guideline updates Patrick Steinhardt
  2024-07-24 11:05 ` [PATCH 1/3] Documentation: clarify indentation style for C preprocessor directives Patrick Steinhardt
  2024-07-24 11:05 ` [PATCH 2/3] Documentation: document naming schema for struct-related functions Patrick Steinhardt
@ 2024-07-24 11:05 ` Patrick Steinhardt
  2024-07-24 11:46   ` Karthik Nayak
  2024-07-24 16:52   ` Junio C Hamano
  2024-07-24 11:47 ` [PATCH 0/3] Documentation: some coding guideline updates Karthik Nayak
  2024-07-30  7:24 ` [PATCH v2 0/5] " Patrick Steinhardt
  4 siblings, 2 replies; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-24 11:05 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

[-- Attachment #1: Type: text/plain, Size: 1481 bytes --]

We semi-regularly have discussions around whether a function shall be
named `release()` or `free()`. For most of the part we use these two
terminologies quite consistently though:

  - `release()` only frees internal state of a structure, whereas the
    structure itself is not free'd.

  - `free()` frees both internal state and the structure itself.

Carve out a space where we can add idiomatic names for common functions
in our coding guidelines. This space can get extended in the future when
we feel the need to document more idiomatic names.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/CodingGuidelines | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 34fcbcb5a4..ace4c4ad0c 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -560,6 +560,18 @@ For C programs:
 
 	void reset_strbuf(struct strbuf *buf);
 
+ - There are several common idiomatic names for functions performing
+   specific tasks on structures:
+
+    - `<struct>_init()` initializes a structure without allocating the
+      structure itself.
+
+    - `<struct>_release()` releases a structure's contents without
+      freeing the structure.
+
+    - `<struct>_free()` releases a structure's contents and frees the
+      structure.
+
 For Perl programs:
 
  - Most of the C guidelines above apply.
-- 
2.46.0.rc1.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/3] Documentation: document naming schema for struct-related functions
  2024-07-24 11:05 ` [PATCH 2/3] Documentation: document naming schema for struct-related functions Patrick Steinhardt
@ 2024-07-24 11:42   ` Karthik Nayak
  2024-07-24 13:12     ` Patrick Steinhardt
  2024-07-24 16:50   ` Junio C Hamano
  1 sibling, 1 reply; 29+ messages in thread
From: Karthik Nayak @ 2024-07-24 11:42 UTC (permalink / raw)
  To: Patrick Steinhardt, git

[-- Attachment #1: Type: text/plain, Size: 2219 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> We nowadays have a proper mishmash of struct-related functions that are
> called `<verb>_<struct>` (e.g. `clear_prio_queue()`) versus functions
> that are called `<struct>_<verb>` (e.g. `strbuf_clear()`). While the
> former style may be easier to tie into a spoken conversation, most of
> our communication happens in text anyway. Furthermore, prefixing
> functions with the name of the structure they operate on makes it way
> easier to group them together, see which functions are related, and will
> also help folks who are using code completion.
>
> Let's thus settle on one style, namely the one where functions start
> with the name of the structure they operate on.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/CodingGuidelines | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
> index 1d09384f28..34fcbcb5a4 100644
> --- a/Documentation/CodingGuidelines
> +++ b/Documentation/CodingGuidelines
> @@ -541,6 +541,25 @@ For C programs:
>     use your own debugger and arguments. Example: `GIT_DEBUGGER="ddd --gdb"
>     ./bin-wrappers/git log` (See `wrap-for-bin.sh`.)
>
> + - Functions that operate on a specific structure and which are used by
> +   other subsystems shall be named after the structure. The function
> +   name should start with the name of the structure followed by a verb.
> +   E.g.
> +

Nit: It would be nice to add `<struct>_<verb>` here, since we do something
similar in the next patch.

> +	struct strbuf;
> +
> +	void strbuf_add(struct strbuf *buf, ...);
> +
> +	void strbuf_reset(struct strbuf *buf);
> +
> +    is preferred over:
> +
> +	struct strbuf;
> +
> +	void add_string(struct strbuf *buf, ...);
> +

I first thought this was a typo and should've been `add_strbuf` instead,
but I'm sure your intention was to show _other forms_ instead of
`<struct>_<verb>` here.

Maybe we could do s/is preferred over/is preferred over other forms,
like/. I dunno. It is probably fine as is :)

> +	void reset_strbuf(struct strbuf *buf);
> +
>  For Perl programs:
>
>   - Most of the C guidelines above apply.
> --
> 2.46.0.rc1.dirty

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] Documentation: document difference between release and free
  2024-07-24 11:05 ` [PATCH 3/3] Documentation: document difference between release and free Patrick Steinhardt
@ 2024-07-24 11:46   ` Karthik Nayak
  2024-07-24 13:11     ` Patrick Steinhardt
  2024-07-24 16:52   ` Junio C Hamano
  1 sibling, 1 reply; 29+ messages in thread
From: Karthik Nayak @ 2024-07-24 11:46 UTC (permalink / raw)
  To: Patrick Steinhardt, git

[-- Attachment #1: Type: text/plain, Size: 1720 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> We semi-regularly have discussions around whether a function shall be
> named `release()` or `free()`. For most of the part we use these two
> terminologies quite consistently though:
>

I noticed there is also `clear()` used in some places. Should we also
mention that we don't recommend using `clear()` WRT freeing memory?

>   - `release()` only frees internal state of a structure, whereas the
>     structure itself is not free'd.
>
>   - `free()` frees both internal state and the structure itself.
>
> Carve out a space where we can add idiomatic names for common functions
> in our coding guidelines. This space can get extended in the future when
> we feel the need to document more idiomatic names.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/CodingGuidelines | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
> index 34fcbcb5a4..ace4c4ad0c 100644
> --- a/Documentation/CodingGuidelines
> +++ b/Documentation/CodingGuidelines
> @@ -560,6 +560,18 @@ For C programs:
>
>  	void reset_strbuf(struct strbuf *buf);
>
> + - There are several common idiomatic names for functions performing
> +   specific tasks on structures:
> +
> +    - `<struct>_init()` initializes a structure without allocating the
> +      structure itself.
> +
> +    - `<struct>_release()` releases a structure's contents without
> +      freeing the structure.
> +
> +    - `<struct>_free()` releases a structure's contents and frees the
> +      structure.
> +
>  For Perl programs:
>
>   - Most of the C guidelines above apply.
> --
> 2.46.0.rc1.dirty

The patch itself looks good.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/3] Documentation: some coding guideline updates
  2024-07-24 11:05 [PATCH 0/3] Documentation: some coding guideline updates Patrick Steinhardt
                   ` (2 preceding siblings ...)
  2024-07-24 11:05 ` [PATCH 3/3] Documentation: document difference between release and free Patrick Steinhardt
@ 2024-07-24 11:47 ` Karthik Nayak
  2024-07-30  7:24 ` [PATCH v2 0/5] " Patrick Steinhardt
  4 siblings, 0 replies; 29+ messages in thread
From: Karthik Nayak @ 2024-07-24 11:47 UTC (permalink / raw)
  To: Patrick Steinhardt, git

[-- Attachment #1: Type: text/plain, Size: 749 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> Hi,
>
> I've had it in my mind for a long time to update our coding guideline
> for some things that I frequently stumble over during code reviews. This
> small patch series fills in those gaps.
>

Thanks for the initiative. I've left some small comments, but overall
this already looks good to me.

> Thanks!
>
> Patrick
>
> Patrick Steinhardt (3):
>   Documentation: clarify indentation style for C preprocessor directives
>   Documentation: document naming schema for struct-related functions
>   Documentation: document difference between release and free
>
>  Documentation/CodingGuidelines | 41 ++++++++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
>
> --
> 2.46.0.rc1.dirty

Karthik

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] Documentation: document difference between release and free
  2024-07-24 11:46   ` Karthik Nayak
@ 2024-07-24 13:11     ` Patrick Steinhardt
  2024-07-24 14:30       ` Phillip Wood
  0 siblings, 1 reply; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-24 13:11 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 884 bytes --]

On Wed, Jul 24, 2024 at 04:46:20AM -0700, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > We semi-regularly have discussions around whether a function shall be
> > named `release()` or `free()`. For most of the part we use these two
> > terminologies quite consistently though:
> >
> 
> I noticed there is also `clear()` used in some places. Should we also
> mention that we don't recommend using `clear()` WRT freeing memory?

In any case I think we should decide on eithe using `clear()` or using
`release()` for consistency's sake. Which of both  we use I don't quite
care, but the following very shoddy analysis clearly favors `release()`:

    $ git grep '_clear(' | wc -l
    844
    $ git grep '_release(' | wc -l
    2126

So yeah, I'm happy to explicitly mention that `clear()` shouldn't be
used in favor of `release()`.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/3] Documentation: document naming schema for struct-related functions
  2024-07-24 11:42   ` Karthik Nayak
@ 2024-07-24 13:12     ` Patrick Steinhardt
  0 siblings, 0 replies; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-24 13:12 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 899 bytes --]

On Wed, Jul 24, 2024 at 04:42:29AM -0700, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
> > index 1d09384f28..34fcbcb5a4 100644
> > --- a/Documentation/CodingGuidelines
> > +++ b/Documentation/CodingGuidelines
> > @@ -541,6 +541,25 @@ For C programs:
> >     use your own debugger and arguments. Example: `GIT_DEBUGGER="ddd --gdb"
> >     ./bin-wrappers/git log` (See `wrap-for-bin.sh`.)
> >
> > + - Functions that operate on a specific structure and which are used by
> > +   other subsystems shall be named after the structure. The function
> > +   name should start with the name of the structure followed by a verb.
> > +   E.g.
> > +
> 
> Nit: It would be nice to add `<struct>_<verb>` here, since we do something
> similar in the next patch.

Makes sense, will do.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] Documentation: document difference between release and free
  2024-07-24 13:11     ` Patrick Steinhardt
@ 2024-07-24 14:30       ` Phillip Wood
  2024-07-24 18:02         ` Junio C Hamano
  0 siblings, 1 reply; 29+ messages in thread
From: Phillip Wood @ 2024-07-24 14:30 UTC (permalink / raw)
  To: Patrick Steinhardt, Karthik Nayak; +Cc: git

Hi Patrick

On 24/07/2024 14:11, Patrick Steinhardt wrote:
> On Wed, Jul 24, 2024 at 04:46:20AM -0700, Karthik Nayak wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>>
>>> We semi-regularly have discussions around whether a function shall be
>>> named `release()` or `free()`. For most of the part we use these two
>>> terminologies quite consistently though:
>>>
>>
>> I noticed there is also `clear()` used in some places. Should we also
>> mention that we don't recommend using `clear()` WRT freeing memory?
> 
> In any case I think we should decide on eithe using `clear()` or using
> `release()` for consistency's sake. Which of both  we use I don't quite
> care, but the following very shoddy analysis clearly favors `release()`:
> 
>      $ git grep '_clear(' | wc -l
>      844
>      $ git grep '_release(' | wc -l
>      2126

I think a fairer comparison would be to look at function declarations, 
not all the call sites.

$ { git grep 'void [a-z_]*_release(' '*.h'
     git grep 'static void [a-z_]*_release(' '*.c'
   } | wc -l
47
$ { git grep 'void [a-z_]*_clear(' '*.h'
     git grep 'static void [a-z_]*_clear(' '*.c'
   } | wc -l
58

So we have more _clear() functions than _release() functions. I think 
there may sometimes be a semantic difference between _clear() and 
_release() as well where some _clear() functions zero out the struct 
after freeing the members.

Thanks for working on this it will be a useful addition to our coding 
guidelines

Best Wishes

Phillip

> So yeah, I'm happy to explicitly mention that `clear()` shouldn't be
> used in favor of `release()`.
> 
> Patrick

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/3] Documentation: clarify indentation style for C preprocessor directives
  2024-07-24 11:05 ` [PATCH 1/3] Documentation: clarify indentation style for C preprocessor directives Patrick Steinhardt
@ 2024-07-24 16:41   ` Junio C Hamano
  2024-07-25  5:06     ` Junio C Hamano
  0 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2024-07-24 16:41 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak

Patrick Steinhardt <ps@pks.im> writes:

> There has recently been some discussion around how C preprocessor
> directives shall be indented [1]. This discussion was settled towards
> indenting after the hash by two spaces [2]. Document it as such.

It was settled to have space after and not before the hash, but I do
not recall ever agreeing to two spaces.  I prefer to increment by 1
for each level instead to keep the whole thing less distracting
while carrying meaningful information.

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/3] Documentation: document naming schema for struct-related functions
  2024-07-24 11:05 ` [PATCH 2/3] Documentation: document naming schema for struct-related functions Patrick Steinhardt
  2024-07-24 11:42   ` Karthik Nayak
@ 2024-07-24 16:50   ` Junio C Hamano
  2024-07-24 16:56     ` Junio C Hamano
  2024-07-30  6:41     ` Patrick Steinhardt
  1 sibling, 2 replies; 29+ messages in thread
From: Junio C Hamano @ 2024-07-24 16:50 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak

Patrick Steinhardt <ps@pks.im> writes:

> + - Functions that operate on a specific structure and which are used by
> +   other subsystems shall be named after the structure.

I am not sure if this is a good guideline.  In the case of strbuf_,
you could say it is named after the structure, but I would actually
think that both structure and the functions are named after the
subsystem/API (i.e. we have "strbuf" that other subsystems can use).

> + The function
> +   name should start with the name of the structure followed by a verb.
> +   E.g.
> +
> +	struct strbuf;
> +
> +	void strbuf_add(struct strbuf *buf, ...);
> +
> +	void strbuf_reset(struct strbuf *buf);
> +
> +    is preferred over:
> +
> +	struct strbuf;
> +
> +	void add_string(struct strbuf *buf, ...);
> +
> +	void reset_strbuf(struct strbuf *buf);

Do we want to rename start_command(), finish_command(),
run_command() and pipe_command()?  child_process_start() sounds
somewhat ungrammatical.

By the way, some functions that have strbuf_ in their names do not
have anything to do with managing strings using the strbuf
structure, but they do things that are *not* about strings, but
happen to use strbuf as a way to either feed input to them or carry
output out of them.  They should be renamed away to lose "strbuf_"
in their names (e.g. strbuf_realpath() is about pathnames; it is
immaterial that the function happens to use strbuf to hold its
output but takes input from "const char *").

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] Documentation: document difference between release and free
  2024-07-24 11:05 ` [PATCH 3/3] Documentation: document difference between release and free Patrick Steinhardt
  2024-07-24 11:46   ` Karthik Nayak
@ 2024-07-24 16:52   ` Junio C Hamano
  2024-07-30  6:43     ` Patrick Steinhardt
  1 sibling, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2024-07-24 16:52 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak

Patrick Steinhardt <ps@pks.im> writes:

> We semi-regularly have discussions around whether a function shall be
> named `release()` or `free()`. For most of the part we use these two
> terminologies quite consistently though:
>
>   - `release()` only frees internal state of a structure, whereas the
>     structure itself is not free'd.
>
>   - `free()` frees both internal state and the structure itself.
>
> Carve out a space where we can add idiomatic names for common functions
> in our coding guidelines. This space can get extended in the future when
> we feel the need to document more idiomatic names.

We have _clear() in some subsystem/API.  Are we sure the listed two
are sufficient and _clear() can be replaced with one of them
(perhaps _release())?

> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/CodingGuidelines | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
> index 34fcbcb5a4..ace4c4ad0c 100644
> --- a/Documentation/CodingGuidelines
> +++ b/Documentation/CodingGuidelines
> @@ -560,6 +560,18 @@ For C programs:
>  
>  	void reset_strbuf(struct strbuf *buf);
>  
> + - There are several common idiomatic names for functions performing
> +   specific tasks on structures:
> +
> +    - `<struct>_init()` initializes a structure without allocating the
> +      structure itself.
> +
> +    - `<struct>_release()` releases a structure's contents without
> +      freeing the structure.
> +
> +    - `<struct>_free()` releases a structure's contents and frees the
> +      structure.
> +
>  For Perl programs:
>  
>   - Most of the C guidelines above apply.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/3] Documentation: document naming schema for struct-related functions
  2024-07-24 16:50   ` Junio C Hamano
@ 2024-07-24 16:56     ` Junio C Hamano
  2024-07-30  6:41     ` Patrick Steinhardt
  1 sibling, 0 replies; 29+ messages in thread
From: Junio C Hamano @ 2024-07-24 16:56 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak

Junio C Hamano <gitster@pobox.com> writes:

> Patrick Steinhardt <ps@pks.im> writes:
>
>> + - Functions that operate on a specific structure and which are used by
>> +   other subsystems shall be named after the structure.
> ...
>> + The function
>> +   name should start with the name of the structure followed by a verb.
>> +   E.g.
>> +
>> +	struct strbuf;
>> +
>> +	void strbuf_add(struct strbuf *buf, ...);
>> + ...
>> +	void strbuf_reset(struct strbuf *buf);

Another thing we may want to say about these "The primary data
structure subsystem 'S' deals with is called 'struct S' and the
functions that operate on 'struct S' are named S_<verb>()" theme is
that the convention for S_<verb>() functions is to have the operand,
which is always 'struct S *', of the verb as the first parameter.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] Documentation: document difference between release and free
  2024-07-24 14:30       ` Phillip Wood
@ 2024-07-24 18:02         ` Junio C Hamano
  2024-07-30  6:49           ` Patrick Steinhardt
  0 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2024-07-24 18:02 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Patrick Steinhardt, Karthik Nayak, git

Phillip Wood <phillip.wood123@gmail.com> writes:

>>> I noticed there is also `clear()` used in some places. Should we also
>>> mention that we don't recommend using `clear()` WRT freeing memory?
>> In any case I think we should decide on eithe using `clear()` or
>> using
>> `release()` for consistency's sake. Which of both  we use I don't quite
>> care, but the following very shoddy analysis clearly favors `release()`:
>>      $ git grep '_clear(' | wc -l
>>      844
>>      $ git grep '_release(' | wc -l
>>      2126
>
> I think a fairer comparison would be to look at function declarations,
> not all the call sites.
>
> $ { git grep 'void [a-z_]*_release(' '*.h'
>     git grep 'static void [a-z_]*_release(' '*.c'
>   } | wc -l
> 47
> $ { git grep 'void [a-z_]*_clear(' '*.h'
>     git grep 'static void [a-z_]*_clear(' '*.c'
>   } | wc -l
> 58
>
> So we have more _clear() functions than _release() functions. I think
> there may sometimes be a semantic difference between _clear() and
> _release() as well where some _clear() functions zero out the struct
> after freeing the members.
>
> Thanks for working on this it will be a useful addition to our coding
> guidelines

Thanks for doing a more thorough study of the current codebase.  I
tend to agree that the number of actual _clear() functions matter a
lot more than how many callsites call _clear(), and it would make
sense to standardise on it.  If everything else being equal, it does
not matter which one we pick, but it rarely happens that everything
else is equal.

 - "release" is a bit more cumbersome to type and read than "clear".

 - "clear" at least to me says more about the state of the thing
   after it got cleared (e.g., I would expect it would be filled
   with NUL bytes)

 - "release" places a lot more stress on what happens to the things
   that were contained before the release takes place.

For example, upon either "clear" or "release", I would expect
everything pointed by elements in an array member of the struct, and
the array pointed at by the member, are free'd when we are
"clearing/releasing" a strvec.  But I may not care what is left in
it after "release".  It can be left to hold all the bytes the struct
had before "release" got called, as anybody who called the function
are not supposed to look at the struct again anyway.  But we may
choose not to have such a variant and always clear the struct after
releasing resources it held, just for good hygiene.

So in short, I would consider that "clear = release + init".  If we
want to have both "clear" and "release" and have them distinct
meaning, that is fine.  If we want to simplify and do without "just
release and leave them dirty" variant, then we need only one name
for it, and I do not mind if we called it "release", even though
I would think "clear" is a better name for the action that behaves
as if "init" was done at the end to make it reusable.

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/3] Documentation: clarify indentation style for C preprocessor directives
  2024-07-24 16:41   ` Junio C Hamano
@ 2024-07-25  5:06     ` Junio C Hamano
  2024-07-30  6:32       ` Patrick Steinhardt
  0 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2024-07-25  5:06 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak

Junio C Hamano <gitster@pobox.com> writes:

> Patrick Steinhardt <ps@pks.im> writes:
>
>> There has recently been some discussion around how C preprocessor
>> directives shall be indented [1]. This discussion was settled towards
>> indenting after the hash by two spaces [2]. Document it as such.
>
> It was settled to have space after and not before the hash, but I do
> not recall ever agreeing to two spaces.  I prefer to increment by 1
> for each level instead to keep the whole thing less distracting
> while carrying meaningful information.

Using the indentation consistently is a good thing I do not object
to.

Among our existing header and source files (excluding compat/ that
is full of borrowed sources), scanning output from

    $ git grep -e '#  *' '*.[ch]' ':!compat/'

tells me that

 - builtin/gc.c (3 lines)
 - git-compat-util.h (52 lines)
 - trace.c (2 lines)
 - wildmatch.c (6 lines)

use one space indent after '#' per level, while

 - hash.h (10 lines, inconsistently 2 uses 1 per level)
 - thread-utils.c (8 lines)

use two space indent after '#' per level.

In compat/ directory, only compat/nedmalloc uses 2-space indent.
There in the hierarchy are so many files we borrowed from GNU, whose
coding style sticks to one-space indent.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/3] Documentation: clarify indentation style for C preprocessor directives
  2024-07-25  5:06     ` Junio C Hamano
@ 2024-07-30  6:32       ` Patrick Steinhardt
  0 siblings, 0 replies; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-30  6:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Karthik Nayak

[-- Attachment #1: Type: text/plain, Size: 2429 bytes --]

On Wed, Jul 24, 2024 at 10:06:03PM -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Patrick Steinhardt <ps@pks.im> writes:
> >
> >> There has recently been some discussion around how C preprocessor
> >> directives shall be indented [1]. This discussion was settled towards
> >> indenting after the hash by two spaces [2]. Document it as such.
> >
> > It was settled to have space after and not before the hash, but I do
> > not recall ever agreeing to two spaces.  I prefer to increment by 1
> > for each level instead to keep the whole thing less distracting
> > while carrying meaningful information.
> 
> Using the indentation consistently is a good thing I do not object
> to.
> 
> Among our existing header and source files (excluding compat/ that
> is full of borrowed sources), scanning output from
> 
>     $ git grep -e '#  *' '*.[ch]' ':!compat/'
> 
> tells me that
> 
>  - builtin/gc.c (3 lines)
>  - git-compat-util.h (52 lines)
>  - trace.c (2 lines)
>  - wildmatch.c (6 lines)
> 
> use one space indent after '#' per level, while
> 
>  - hash.h (10 lines, inconsistently 2 uses 1 per level)
>  - thread-utils.c (8 lines)
> 
> use two space indent after '#' per level.
> 
> In compat/ directory, only compat/nedmalloc uses 2-space indent.
> There in the hierarchy are so many files we borrowed from GNU, whose
> coding style sticks to one-space indent.

Thanks for checking. I don't really mind whether we use one or two
spaces, as long as we are being consistent. I actually took the 2-space
indent from the comment for "IndentPPDirectives" in ".clang-format",
where it looks as if clang-format was using two spaces. So I took this
as being the official style we settled on.

One thing I noticed though is that clang-format is inconsistent with
either of those styles because it uses a tab character to indent nested
preprocessor directives. So the comment we have in that file is quite
misleading. We can fix this with the below change though. I'll add that
to this series.

Patrick

diff --git a/.clang-format b/.clang-format
index 16fd12253e..0e3606a8bb 100644
--- a/.clang-format
+++ b/.clang-format
@@ -105,6 +105,7 @@ IndentCaseLabels: false
 # #  include <foo>
 # #endif
 IndentPPDirectives: AfterHash
+PPIndentWidth: 1
 
 # Don't indent a function definition or declaration if it is wrapped after the
 # type

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/3] Documentation: document naming schema for struct-related functions
  2024-07-24 16:50   ` Junio C Hamano
  2024-07-24 16:56     ` Junio C Hamano
@ 2024-07-30  6:41     ` Patrick Steinhardt
  1 sibling, 0 replies; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-30  6:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Karthik Nayak

[-- Attachment #1: Type: text/plain, Size: 2872 bytes --]

On Wed, Jul 24, 2024 at 09:50:40AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > + - Functions that operate on a specific structure and which are used by
> > +   other subsystems shall be named after the structure.
> 
> I am not sure if this is a good guideline.  In the case of strbuf_,
> you could say it is named after the structure, but I would actually
> think that both structure and the functions are named after the
> subsystem/API (i.e. we have "strbuf" that other subsystems can use).

Well, in most cases I'd expect that the structure is named after the
subsystem/API, itself. I'm happy to relax this statement though and say
that functions should be named after the subsystem.

> > + The function
> > +   name should start with the name of the structure followed by a verb.
> > +   E.g.
> > +
> > +	struct strbuf;
> > +
> > +	void strbuf_add(struct strbuf *buf, ...);
> > +
> > +	void strbuf_reset(struct strbuf *buf);
> > +
> > +    is preferred over:
> > +
> > +	struct strbuf;
> > +
> > +	void add_string(struct strbuf *buf, ...);
> > +
> > +	void reset_strbuf(struct strbuf *buf);
> 
> Do we want to rename start_command(), finish_command(),
> run_command() and pipe_command()? 

I wouldn't quite go that far for now. We may want to slowly adapt some
parts of our interfaces over time. But my main goal is rather to make
the style consistent for _new_ interfaces we add.

> child_process_start() sounds somewhat ungrammatical.

It does, but I would argue that it is no different from `strbuf_reset()`
and other functions where we have the verb as a trailer. And I have to
say that I find it a ton easier to reason about code where we have the
subsystem it belongs to as a prefix as it neatly groups together things
and immediately sets you into the correct mindset of what to expect.
That is of course a question of preference, I'm not claiming that my
preferral is objectively the best.

But again, what I do want to see is consistency. Nobody is helped when
we mix both styles in my opinion. It makes writing, reading and
reviewing code harder than it has to be because you always have to
remember whether it is `string_list_free()`, `free_string_list()`,
`string_list_clear()` or `clear_string_list()`.

> By the way, some functions that have strbuf_ in their names do not
> have anything to do with managing strings using the strbuf
> structure, but they do things that are *not* about strings, but
> happen to use strbuf as a way to either feed input to them or carry
> output out of them.  They should be renamed away to lose "strbuf_"
> in their names (e.g. strbuf_realpath() is about pathnames; it is
> immaterial that the function happens to use strbuf to hold its
> output but takes input from "const char *").

Yeah, that's fair indeed.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] Documentation: document difference between release and free
  2024-07-24 16:52   ` Junio C Hamano
@ 2024-07-30  6:43     ` Patrick Steinhardt
  0 siblings, 0 replies; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-30  6:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Karthik Nayak

[-- Attachment #1: Type: text/plain, Size: 1278 bytes --]

On Wed, Jul 24, 2024 at 09:52:20AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > We semi-regularly have discussions around whether a function shall be
> > named `release()` or `free()`. For most of the part we use these two
> > terminologies quite consistently though:
> >
> >   - `release()` only frees internal state of a structure, whereas the
> >     structure itself is not free'd.
> >
> >   - `free()` frees both internal state and the structure itself.
> >
> > Carve out a space where we can add idiomatic names for common functions
> > in our coding guidelines. This space can get extended in the future when
> > we feel the need to document more idiomatic names.
> 
> We have _clear() in some subsystem/API.  Are we sure the listed two
> are sufficient and _clear() can be replaced with one of them
> (perhaps _release())?

I'd think that `clear()` can be replaced by `release()`, yes. But in
another branch I heard the argument that `clear()` is equivalent to
`release()` followed by `init()`, which I do like. The only downside is
that `init()` must not allocate memory in this case, as otherwise the
`clear()` function would lose the ability to release all resources
associated with its structure.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] Documentation: document difference between release and free
  2024-07-24 18:02         ` Junio C Hamano
@ 2024-07-30  6:49           ` Patrick Steinhardt
  0 siblings, 0 replies; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-30  6:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Phillip Wood, Karthik Nayak, git

[-- Attachment #1: Type: text/plain, Size: 4089 bytes --]

On Wed, Jul 24, 2024 at 11:02:34AM -0700, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> 
> >>> I noticed there is also `clear()` used in some places. Should we also
> >>> mention that we don't recommend using `clear()` WRT freeing memory?
> >> In any case I think we should decide on eithe using `clear()` or
> >> using
> >> `release()` for consistency's sake. Which of both  we use I don't quite
> >> care, but the following very shoddy analysis clearly favors `release()`:
> >>      $ git grep '_clear(' | wc -l
> >>      844
> >>      $ git grep '_release(' | wc -l
> >>      2126
> >
> > I think a fairer comparison would be to look at function declarations,
> > not all the call sites.
> >
> > $ { git grep 'void [a-z_]*_release(' '*.h'
> >     git grep 'static void [a-z_]*_release(' '*.c'
> >   } | wc -l
> > 47
> > $ { git grep 'void [a-z_]*_clear(' '*.h'
> >     git grep 'static void [a-z_]*_clear(' '*.c'
> >   } | wc -l
> > 58
> >
> > So we have more _clear() functions than _release() functions. I think
> > there may sometimes be a semantic difference between _clear() and
> > _release() as well where some _clear() functions zero out the struct
> > after freeing the members.
> >
> > Thanks for working on this it will be a useful addition to our coding
> > guidelines
> 
> Thanks for doing a more thorough study of the current codebase.  I
> tend to agree that the number of actual _clear() functions matter a
> lot more than how many callsites call _clear(), and it would make
> sense to standardise on it.  If everything else being equal, it does
> not matter which one we pick, but it rarely happens that everything
> else is equal.

I'm not quite sure that I agree with this. I think coding style is most
heavily influenced by what you see most in a codebase. So I'd argue that
it is both declarations/definitions and callsites that influence the
general shape.

This of course means that interfaces like `struct strbuf` have way more
impact on our coding style than others, simply because it is being used
all over the place. But in my opinion that follows naturally, because
the coding style that we use should work best for what is being used
most often.

But anyway, this is splitting hairs :)

>  - "release" is a bit more cumbersome to type and read than "clear".
> 
>  - "clear" at least to me says more about the state of the thing
>    after it got cleared (e.g., I would expect it would be filled
>    with NUL bytes)
> 
>  - "release" places a lot more stress on what happens to the things
>    that were contained before the release takes place.
> 
> For example, upon either "clear" or "release", I would expect
> everything pointed by elements in an array member of the struct, and
> the array pointed at by the member, are free'd when we are
> "clearing/releasing" a strvec.  But I may not care what is left in
> it after "release".  It can be left to hold all the bytes the struct
> had before "release" got called, as anybody who called the function
> are not supposed to look at the struct again anyway.  But we may
> choose not to have such a variant and always clear the struct after
> releasing resources it held, just for good hygiene.
> 
> So in short, I would consider that "clear = release + init".  If we
> want to have both "clear" and "release" and have them distinct
> meaning, that is fine.  If we want to simplify and do without "just
> release and leave them dirty" variant, then we need only one name
> for it, and I do not mind if we called it "release", even though
> I would think "clear" is a better name for the action that behaves
> as if "init" was done at the end to make it reusable.

I actually like this definition. The only downside I see of defining
`clear = release + init` is that `init()` probably shouldn't be allowed
to allocate any memory in this case. Otherwise, calling `clear()` on a
structure would not cause us to free all resources associated with it,
which would be unexpected to me.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 0/5] Documentation: some coding guideline updates
  2024-07-24 11:05 [PATCH 0/3] Documentation: some coding guideline updates Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2024-07-24 11:47 ` [PATCH 0/3] Documentation: some coding guideline updates Karthik Nayak
@ 2024-07-30  7:24 ` Patrick Steinhardt
  2024-07-30  7:24   ` [PATCH v2 1/5] clang-format: fix indentation width for preprocessor directives Patrick Steinhardt
                     ` (6 more replies)
  4 siblings, 7 replies; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-30  7:24 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 7866 bytes --]

Hi,

this is the second version of my patch series that aims to improve our
coding guidelines such that we arrive at a more consistent coding style.

Changes compared to v1:

  - Fix clang-format to use a single space to indent preprocessor
    directives instead of using tabs. Thus, this series is now built
    with kn/ci-clang-format at 1b8f306612 (ci/style-check: add
    `RemoveBracesLLVM` in CI job, 2024-07-23) merged into v2.46.0.

  - Adapt the coding guidelines accordingly to also only use a single
    space for indentation of nested preprocessor directives.

  - Adopt a proposal by Junio to more clearly spell out the relationship
    between a subsystem `S`, `struct S` and its functions `S_<verb>()`.

  - Document `S_clear()`-style functions. I have adopted the proposal by
    Junio hear, where `clear = release + init` with the restriction that
    `S_init()` must not allocate any resources.

  - Add another patch on top that makes variable initializers consistent
    in our coding guidelines. Our style is to add spaces between the
    curly brace and the initializers (`struct foo bar = { something };`).

I think I captured everything that came out of the discussion, but
please let me know in case I misinterpreted or forgot anything.

Thanks!

Patrick

Patrick Steinhardt (5):
  clang-format: fix indentation width for preprocessor directives
  Documentation: clarify indentation style for C preprocessor directives
  Documentation: document naming schema for structs and their functions
  Documentation: document idiomatic function names
  Documentation: consistently use spaces inside initializers

 .clang-format                  |  6 +++--
 Documentation/CodingGuidelines | 48 +++++++++++++++++++++++++++++++++-
 2 files changed, 51 insertions(+), 3 deletions(-)

Range-diff against v1:
-:  ---------- > 1:  c33ad700d6 clang-format: fix indentation width for preprocessor directives
1:  64e0b44993 ! 2:  e3baf01234 Documentation: clarify indentation style for C preprocessor directives
    @@ Metadata
      ## Commit message ##
         Documentation: clarify indentation style for C preprocessor directives
     
    -    There has recently been some discussion around how C preprocessor
    -    directives shall be indented [1]. This discussion was settled towards
    -    indenting after the hash by two spaces [2]. Document it as such.
    -
    -    [1]: https://lore.kernel.org/all/xmqqwmmm1bw6.fsf@gitster.g/
    -    [2]: https://lore.kernel.org/all/20240708092317.267915-2-karthik.188@gmail.com/
    +    In the preceding commit, we have settled on using a single space per
    +    nesting level to indent preprocessor directives. Clarify our coding
    +    guidelines accordingly.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    @@ Documentation/CodingGuidelines: For C programs:
       - We use tabs to indent, and interpret tabs as taking up to
         8 spaces.
      
    -+ - Nested C preprocessor directives are indented after the hash by two
    -+   spaces per nesting level.
    ++ - Nested C preprocessor directives are indented after the hash by one
    ++   space per nesting level.
     +
     +	#if FOO
    -+	#  include <foo.h>
    -+	#  if BAR
    -+	#    include <bar.h>
    -+	#  endif
    ++	# include <foo.h>
    ++	# if BAR
    ++	#  include <bar.h>
    ++	# endif
     +	#endif
     +
       - We try to keep to at most 80 characters per line.
2:  7f07bf1f3b ! 3:  25f73b970d Documentation: document naming schema for struct-related functions
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    Documentation: document naming schema for struct-related functions
    +    Documentation: document naming schema for structs and their functions
     
         We nowadays have a proper mishmash of struct-related functions that are
         called `<verb>_<struct>` (e.g. `clear_prio_queue()`) versus functions
    @@ Documentation/CodingGuidelines: For C programs:
         use your own debugger and arguments. Example: `GIT_DEBUGGER="ddd --gdb"
         ./bin-wrappers/git log` (See `wrap-for-bin.sh`.)
      
    -+ - Functions that operate on a specific structure and which are used by
    -+   other subsystems shall be named after the structure. The function
    -+   name should start with the name of the structure followed by a verb.
    -+   E.g.
    ++ - The primary data structure that a subsystem 'S' deals with is called
    ++   `struct S`. Functions that operate on `struct S` are named
    ++   `S_<verb>()` and should generally receive a pointer to `struct S` as
    ++   first parameter. E.g.
     +
     +	struct strbuf;
     +
3:  5e1de3c315 ! 4:  d4ce00303f Documentation: document difference between release and free
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    Documentation: document difference between release and free
    +    Documentation: document idiomatic function names
     
         We semi-regularly have discussions around whether a function shall be
    -    named `release()` or `free()`. For most of the part we use these two
    -    terminologies quite consistently though:
    -
    -      - `release()` only frees internal state of a structure, whereas the
    -        structure itself is not free'd.
    -
    -      - `free()` frees both internal state and the structure itself.
    +    named `S_release()`, `S_clear()` or `S_free()`. Indeed, it may not be
    +    obvious which of these is preferable as we never really defined what
    +    each of these variants means exactly.
     
         Carve out a space where we can add idiomatic names for common functions
    -    in our coding guidelines. This space can get extended in the future when
    -    we feel the need to document more idiomatic names.
    +    in our coding guidelines and define each of those functions. Like this,
    +    we can get to a shared understanding of their respective semantics and
    +    can easily point towards our style guide in future discussions such that
    +    our codebase becomes more consistent over time.
    +
    +    Note that the intent is not to rename all functions which violate these
    +    semantics right away. Rather, the intent is to slowly converge towards a
    +    common style over time.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    @@ Documentation/CodingGuidelines: For C programs:
      	void reset_strbuf(struct strbuf *buf);
      
     + - There are several common idiomatic names for functions performing
    -+   specific tasks on structures:
    ++   specific tasks on a structure `S`:
     +
    -+    - `<struct>_init()` initializes a structure without allocating the
    ++    - `S_init()` initializes a structure without allocating the
     +      structure itself.
     +
    -+    - `<struct>_release()` releases a structure's contents without
    -+      freeing the structure.
    ++    - `S_release()` releases a structure's contents without freeing the
    ++      structure.
    ++
    ++    - `S_clear()` is equivalent to `S_release()` followed by `S_init()`
    ++      such that the structure is directly usable after clearing it. When
    ++      `S_clear()` is provided, `S_init()` shall not allocate resources
    ++      that need to be released again.
     +
    -+    - `<struct>_free()` releases a structure's contents and frees the
    ++    - `S_free()` releases a structure's contents and frees the
     +      structure.
     +
      For Perl programs:
-:  ---------- > 5:  8789323ac7 Documentation: consistently use spaces inside initializers

-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 1/5] clang-format: fix indentation width for preprocessor directives
  2024-07-30  7:24 ` [PATCH v2 0/5] " Patrick Steinhardt
@ 2024-07-30  7:24   ` Patrick Steinhardt
  2024-07-30 14:19     ` Karthik Nayak
  2024-07-30  7:24   ` [PATCH v2 2/5] Documentation: clarify indentation style for C " Patrick Steinhardt
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-30  7:24 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 1513 bytes --]

In [1], we have improved our clang-format configuration to also specify
the style for how to indent preprocessor directives. But while we have
settled the question of where to put the indentation, either before or
after the hash sign, we didn't specify exactly how to indent.

With the current configuration, clang-format uses tabs to indent each
level of nested preprocessor directives, which is in fact unintentional
and never done in our codebase. Instead, we use a mixture of indenting
by either one or two spaces, where using a single space is somewhat more
common.

Adapt our clang-format configuration accordingly by specifying an
indentation width of one space.

[1]: <20240708092317.267915-1-karthik.188@gmail.com>

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .clang-format | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/.clang-format b/.clang-format
index 16fd12253e..0b82f3c776 100644
--- a/.clang-format
+++ b/.clang-format
@@ -100,11 +100,13 @@ BreakStringLiterals: false
 # Switch statement body is always indented one level more than case labels.
 IndentCaseLabels: false
 
-# Indents directives before the hash.
+# Indents directives before the hash. Each level uses a single space for
+# indentation.
 # #if FOO
-# #  include <foo>
+# # include <foo>
 # #endif
 IndentPPDirectives: AfterHash
+PPIndentWidth: 1
 
 # Don't indent a function definition or declaration if it is wrapped after the
 # type
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 2/5] Documentation: clarify indentation style for C preprocessor directives
  2024-07-30  7:24 ` [PATCH v2 0/5] " Patrick Steinhardt
  2024-07-30  7:24   ` [PATCH v2 1/5] clang-format: fix indentation width for preprocessor directives Patrick Steinhardt
@ 2024-07-30  7:24   ` Patrick Steinhardt
  2024-07-30  7:24   ` [PATCH v2 3/5] Documentation: document naming schema for structs and their functions Patrick Steinhardt
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-30  7:24 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 961 bytes --]

In the preceding commit, we have settled on using a single space per
nesting level to indent preprocessor directives. Clarify our coding
guidelines accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/CodingGuidelines | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 1d92b2da03..65fba3b810 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -214,6 +214,16 @@ For C programs:
  - We use tabs to indent, and interpret tabs as taking up to
    8 spaces.
 
+ - Nested C preprocessor directives are indented after the hash by one
+   space per nesting level.
+
+	#if FOO
+	# include <foo.h>
+	# if BAR
+	#  include <bar.h>
+	# endif
+	#endif
+
  - We try to keep to at most 80 characters per line.
 
  - As a Git developer we assume you have a reasonably modern compiler
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 3/5] Documentation: document naming schema for structs and their functions
  2024-07-30  7:24 ` [PATCH v2 0/5] " Patrick Steinhardt
  2024-07-30  7:24   ` [PATCH v2 1/5] clang-format: fix indentation width for preprocessor directives Patrick Steinhardt
  2024-07-30  7:24   ` [PATCH v2 2/5] Documentation: clarify indentation style for C " Patrick Steinhardt
@ 2024-07-30  7:24   ` Patrick Steinhardt
  2024-07-30  7:24   ` [PATCH v2 4/5] Documentation: document idiomatic function names Patrick Steinhardt
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-30  7:24 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 1763 bytes --]

We nowadays have a proper mishmash of struct-related functions that are
called `<verb>_<struct>` (e.g. `clear_prio_queue()`) versus functions
that are called `<struct>_<verb>` (e.g. `strbuf_clear()`). While the
former style may be easier to tie into a spoken conversation, most of
our communication happens in text anyway. Furthermore, prefixing
functions with the name of the structure they operate on makes it way
easier to group them together, see which functions are related, and will
also help folks who are using code completion.

Let's thus settle on one style, namely the one where functions start
with the name of the structure they operate on.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/CodingGuidelines | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 65fba3b810..a6a1ede204 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -541,6 +541,25 @@ For C programs:
    use your own debugger and arguments. Example: `GIT_DEBUGGER="ddd --gdb"
    ./bin-wrappers/git log` (See `wrap-for-bin.sh`.)
 
+ - The primary data structure that a subsystem 'S' deals with is called
+   `struct S`. Functions that operate on `struct S` are named
+   `S_<verb>()` and should generally receive a pointer to `struct S` as
+   first parameter. E.g.
+
+	struct strbuf;
+
+	void strbuf_add(struct strbuf *buf, ...);
+
+	void strbuf_reset(struct strbuf *buf);
+
+    is preferred over:
+
+	struct strbuf;
+
+	void add_string(struct strbuf *buf, ...);
+
+	void reset_strbuf(struct strbuf *buf);
+
 For Perl programs:
 
  - Most of the C guidelines above apply.
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 4/5] Documentation: document idiomatic function names
  2024-07-30  7:24 ` [PATCH v2 0/5] " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-07-30  7:24   ` [PATCH v2 3/5] Documentation: document naming schema for structs and their functions Patrick Steinhardt
@ 2024-07-30  7:24   ` Patrick Steinhardt
  2024-07-30  7:24   ` [PATCH v2 5/5] Documentation: consistently use spaces inside initializers Patrick Steinhardt
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-30  7:24 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 1937 bytes --]

We semi-regularly have discussions around whether a function shall be
named `S_release()`, `S_clear()` or `S_free()`. Indeed, it may not be
obvious which of these is preferable as we never really defined what
each of these variants means exactly.

Carve out a space where we can add idiomatic names for common functions
in our coding guidelines and define each of those functions. Like this,
we can get to a shared understanding of their respective semantics and
can easily point towards our style guide in future discussions such that
our codebase becomes more consistent over time.

Note that the intent is not to rename all functions which violate these
semantics right away. Rather, the intent is to slowly converge towards a
common style over time.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/CodingGuidelines | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index a6a1ede204..b63a8f9a44 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -560,6 +560,23 @@ For C programs:
 
 	void reset_strbuf(struct strbuf *buf);
 
+ - There are several common idiomatic names for functions performing
+   specific tasks on a structure `S`:
+
+    - `S_init()` initializes a structure without allocating the
+      structure itself.
+
+    - `S_release()` releases a structure's contents without freeing the
+      structure.
+
+    - `S_clear()` is equivalent to `S_release()` followed by `S_init()`
+      such that the structure is directly usable after clearing it. When
+      `S_clear()` is provided, `S_init()` shall not allocate resources
+      that need to be released again.
+
+    - `S_free()` releases a structure's contents and frees the
+      structure.
+
 For Perl programs:
 
  - Most of the C guidelines above apply.
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 5/5] Documentation: consistently use spaces inside initializers
  2024-07-30  7:24 ` [PATCH v2 0/5] " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2024-07-30  7:24   ` [PATCH v2 4/5] Documentation: document idiomatic function names Patrick Steinhardt
@ 2024-07-30  7:24   ` Patrick Steinhardt
  2024-07-30 20:55   ` [PATCH v2 0/5] Documentation: some coding guideline updates Junio C Hamano
  2024-07-31  9:12   ` Karthik Nayak
  6 siblings, 0 replies; 29+ messages in thread
From: Patrick Steinhardt @ 2024-07-30  7:24 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, Junio C Hamano, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 1162 bytes --]

Our coding guide is inconsistent with how it uses spaces inside of
initializers (`struct foo bar = { something }`). While we mostly carry
the space between open and closing braces and the initialized members,
in one case we don't.

Fix this one instance such that we consistently carry the space. This is
also consistent with how clang-format formats such initializers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/CodingGuidelines | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index b63a8f9a44..3daa1f82bb 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -244,7 +244,7 @@ For C programs:
    . since around 2007 with 2b6854c863a, we have been using
      initializer elements which are not computable at load time. E.g.:
 
-	const char *args[] = {"constant", variable, NULL};
+	const char *args[] = { "constant", variable, NULL };
 
    . since early 2012 with e1327023ea, we have been using an enum
      definition whose last element is followed by a comma.  This, like
-- 
2.46.0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/5] clang-format: fix indentation width for preprocessor directives
  2024-07-30  7:24   ` [PATCH v2 1/5] clang-format: fix indentation width for preprocessor directives Patrick Steinhardt
@ 2024-07-30 14:19     ` Karthik Nayak
  0 siblings, 0 replies; 29+ messages in thread
From: Karthik Nayak @ 2024-07-30 14:19 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 864 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> In [1], we have improved our clang-format configuration to also specify
> the style for how to indent preprocessor directives. But while we have
> settled the question of where to put the indentation, either before or
> after the hash sign, we didn't specify exactly how to indent.
>
> With the current configuration, clang-format uses tabs to indent each
> level of nested preprocessor directives, which is in fact unintentional
> and never done in our codebase. Instead, we use a mixture of indenting
> by either one or two spaces, where using a single space is somewhat more
> common.
>
> Adapt our clang-format configuration accordingly by specifying an
> indentation width of one space.
>
> [1]: <20240708092317.267915-1-karthik.188@gmail.com>
>

I totally missed this, thanks for fixing it up. The patch looks good.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 0/5] Documentation: some coding guideline updates
  2024-07-30  7:24 ` [PATCH v2 0/5] " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2024-07-30  7:24   ` [PATCH v2 5/5] Documentation: consistently use spaces inside initializers Patrick Steinhardt
@ 2024-07-30 20:55   ` Junio C Hamano
  2024-07-31  9:12   ` Karthik Nayak
  6 siblings, 0 replies; 29+ messages in thread
From: Junio C Hamano @ 2024-07-30 20:55 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak, Phillip Wood

Patrick Steinhardt <ps@pks.im> writes:

> this is the second version of my patch series that aims to improve our
> coding guidelines such that we arrive at a more consistent coding style.
> ...
> I think I captured everything that came out of the discussion, but
> please let me know in case I misinterpreted or forgot anything.

Nothing jumps out at me as wrong/missing in this version.

Will queue.  Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 0/5] Documentation: some coding guideline updates
  2024-07-30  7:24 ` [PATCH v2 0/5] " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2024-07-30 20:55   ` [PATCH v2 0/5] Documentation: some coding guideline updates Junio C Hamano
@ 2024-07-31  9:12   ` Karthik Nayak
  6 siblings, 0 replies; 29+ messages in thread
From: Karthik Nayak @ 2024-07-31  9:12 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Junio C Hamano, Phillip Wood

[-- Attachment #1: Type: text/plain, Size: 1535 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> Hi,
>
> this is the second version of my patch series that aims to improve our
> coding guidelines such that we arrive at a more consistent coding style.
>
> Changes compared to v1:
>
>   - Fix clang-format to use a single space to indent preprocessor
>     directives instead of using tabs. Thus, this series is now built
>     with kn/ci-clang-format at 1b8f306612 (ci/style-check: add
>     `RemoveBracesLLVM` in CI job, 2024-07-23) merged into v2.46.0.
>
>   - Adapt the coding guidelines accordingly to also only use a single
>     space for indentation of nested preprocessor directives.
>
>   - Adopt a proposal by Junio to more clearly spell out the relationship
>     between a subsystem `S`, `struct S` and its functions `S_<verb>()`.
>
>   - Document `S_clear()`-style functions. I have adopted the proposal by
>     Junio hear, where `clear = release + init` with the restriction that
>     `S_init()` must not allocate any resources.
>
>   - Add another patch on top that makes variable initializers consistent
>     in our coding guidelines. Our style is to add spaces between the
>     curly brace and the initializers (`struct foo bar = { something };`).
>
> I think I captured everything that came out of the discussion, but
> please let me know in case I misinterpreted or forgot anything.
>
> Thanks!
>

This series seems good. I did a read-through and have no changes to
suggest! I'm really happy with these fixes which slowly improve the
quality of the codebase.

Thanks!

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2024-07-31  9:12 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-24 11:05 [PATCH 0/3] Documentation: some coding guideline updates Patrick Steinhardt
2024-07-24 11:05 ` [PATCH 1/3] Documentation: clarify indentation style for C preprocessor directives Patrick Steinhardt
2024-07-24 16:41   ` Junio C Hamano
2024-07-25  5:06     ` Junio C Hamano
2024-07-30  6:32       ` Patrick Steinhardt
2024-07-24 11:05 ` [PATCH 2/3] Documentation: document naming schema for struct-related functions Patrick Steinhardt
2024-07-24 11:42   ` Karthik Nayak
2024-07-24 13:12     ` Patrick Steinhardt
2024-07-24 16:50   ` Junio C Hamano
2024-07-24 16:56     ` Junio C Hamano
2024-07-30  6:41     ` Patrick Steinhardt
2024-07-24 11:05 ` [PATCH 3/3] Documentation: document difference between release and free Patrick Steinhardt
2024-07-24 11:46   ` Karthik Nayak
2024-07-24 13:11     ` Patrick Steinhardt
2024-07-24 14:30       ` Phillip Wood
2024-07-24 18:02         ` Junio C Hamano
2024-07-30  6:49           ` Patrick Steinhardt
2024-07-24 16:52   ` Junio C Hamano
2024-07-30  6:43     ` Patrick Steinhardt
2024-07-24 11:47 ` [PATCH 0/3] Documentation: some coding guideline updates Karthik Nayak
2024-07-30  7:24 ` [PATCH v2 0/5] " Patrick Steinhardt
2024-07-30  7:24   ` [PATCH v2 1/5] clang-format: fix indentation width for preprocessor directives Patrick Steinhardt
2024-07-30 14:19     ` Karthik Nayak
2024-07-30  7:24   ` [PATCH v2 2/5] Documentation: clarify indentation style for C " Patrick Steinhardt
2024-07-30  7:24   ` [PATCH v2 3/5] Documentation: document naming schema for structs and their functions Patrick Steinhardt
2024-07-30  7:24   ` [PATCH v2 4/5] Documentation: document idiomatic function names Patrick Steinhardt
2024-07-30  7:24   ` [PATCH v2 5/5] Documentation: consistently use spaces inside initializers Patrick Steinhardt
2024-07-30 20:55   ` [PATCH v2 0/5] Documentation: some coding guideline updates Junio C Hamano
2024-07-31  9:12   ` Karthik Nayak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).