[PATCH 0/3] clang-format: fix rules to make the CI job cleaner

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/3] clang-format: fix rules to make the CI job cleaner
@ 2024-10-09 12:51 Karthik Nayak
  2024-10-09 12:55 ` [PATCH 1/3] clang-format: don't enforce the column limit Karthik Nayak
                   ` (5 more replies)
  0 siblings, 6 replies; 41+ messages in thread
From: Karthik Nayak @ 2024-10-09 12:51 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

The clang-format CI job is currently cluttered due to too many errors being
reported. See some of the examples here:

* https://gitlab.com/gitlab-org/git/-/jobs/7854601948
* https://gitlab.com/gitlab-org/git/-/jobs/7843131109

So modify the clang-format with the following changes:
1. Remove the column limit since this is more of a guideline and we always
tend to prefer readability. This is the cause of most of the errors reported
by the tool and should cleanup the reports so we can actually focus on the real
remaining issues.
2. Don't align expressions after linebreaks to ensure that we instead rely on
'ContinuationIndentWidth'. This fix is rather small and ensures that instead of
trying to align wrapped expressions, we follow the indentation width.
3. Align the macro definitions. This is something we follow to keep the macros
readable.

I will still keep monitoring the jobs from time to time to ensure we can fine
tune more as needed, if someone see's something odd, do keep me in the loop.

Thanks

Karthik Nayak (3):
  clang-format: don't enforce the column limit
  clang-format: don't align expressions after linebreaks
  clang-format: align consecutive macro definitions

 .clang-format | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

-- 
2.46.2

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 1/3] clang-format: don't enforce the column limit
  2024-10-09 12:51 [PATCH 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
@ 2024-10-09 12:55 ` Karthik Nayak
  2024-10-09 15:45   ` Justin Tobler
  2024-10-09 12:56 ` [PATCH 2/3] clang-format: don't align expressions after linebreaks Karthik Nayak
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 41+ messages in thread
From: Karthik Nayak @ 2024-10-09 12:55 UTC (permalink / raw)
  To: karthik.188; +Cc: git

The current value for the column limit is set to 80. While this is as
expected, we often prefer readability over this strict limit. This means
it is common to find code which extends over 80 characters. So let's
change the column limit to be 0 instead. This ensures that the formatter
doesn't complain about code strictly not following the column limit.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 .clang-format | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/.clang-format b/.clang-format
index 41969eca4b..38910a3a53 100644
--- a/.clang-format
+++ b/.clang-format
@@ -12,7 +12,11 @@ UseTab: Always
 TabWidth: 8
 IndentWidth: 8
 ContinuationIndentWidth: 8
-ColumnLimit: 80
+
+# While we recommend keeping column limit to 80, we don't want to
+# enforce it as we generally are more lenient with this rule and
+# prefer to prioritize readability.
+ColumnLimit: 0

 # C Language specifics
 Language: Cpp
-- 
2.46.2

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 2/3] clang-format: don't align expressions after linebreaks
  2024-10-09 12:51 [PATCH 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
  2024-10-09 12:55 ` [PATCH 1/3] clang-format: don't enforce the column limit Karthik Nayak
@ 2024-10-09 12:56 ` Karthik Nayak
  2024-10-09 12:56 ` [PATCH 3/3] clang-format: align consecutive macro definitions Karthik Nayak
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 41+ messages in thread
From: Karthik Nayak @ 2024-10-09 12:56 UTC (permalink / raw)
  To: karthik.188; +Cc: git

We enforce alignment of expressions after linebreaks. Which means for
code such as

    return a || b;

it will expect:

   return a ||
          b;

we instead want 'b' to be indent with tabs, which is already done by the
'ContinuationIndentWidth' variable. So let's explicitly set
'AlignOperands' to false.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 .clang-format | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/.clang-format b/.clang-format
index 38910a3a53..af12a038f7 100644
--- a/.clang-format
+++ b/.clang-format
@@ -43,10 +43,9 @@ AlignConsecutiveDeclarations: false
 #   int cccccccc;
 AlignEscapedNewlines: Left

-# Align operands of binary and ternary expressions
-# int aaa = bbbbbbbbbbb +
-#           cccccc;
-AlignOperands: true
+# Don't enforce alignment after linebreaks and instead
+# rely on the ContinuationIndentWidth value.
+AlignOperands: false

 # Don't align trailing comments
 # int a; // Comment a
-- 
2.46.2

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 3/3] clang-format: align consecutive macro definitions
  2024-10-09 12:51 [PATCH 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
  2024-10-09 12:55 ` [PATCH 1/3] clang-format: don't enforce the column limit Karthik Nayak
  2024-10-09 12:56 ` [PATCH 2/3] clang-format: don't align expressions after linebreaks Karthik Nayak
@ 2024-10-09 12:56 ` Karthik Nayak
  2024-10-10  8:27   ` Toon Claes
  2024-10-10 17:59 ` [PATCH v2 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 41+ messages in thread
From: Karthik Nayak @ 2024-10-09 12:56 UTC (permalink / raw)
  To: karthik.188; +Cc: git

We generally align consecutive macro definitions for better readability.
So let's add the rule in clang-format to follow this.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 .clang-format | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/.clang-format b/.clang-format
index af12a038f7..d76d5f1e6a 100644
--- a/.clang-format
+++ b/.clang-format
@@ -36,6 +36,9 @@ AlignConsecutiveAssignments: false
 # double b = 3.14;
 AlignConsecutiveDeclarations: false

+# Align consecutive macro definitions.
+AlignConsecutiveMacros: true
+
 # Align escaped newlines as far left as possible
 # #define A   \
 #   int aaaa; \
-- 
2.46.2

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/3] clang-format: don't enforce the column limit
  2024-10-09 12:55 ` [PATCH 1/3] clang-format: don't enforce the column limit Karthik Nayak
@ 2024-10-09 15:45   ` Justin Tobler
  2024-10-09 22:32     ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Justin Tobler @ 2024-10-09 15:45 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git

On 24/10/09 05:55AM, Karthik Nayak wrote:
> The current value for the column limit is set to 80. While this is as
> expected, we often prefer readability over this strict limit. This means
> it is common to find code which extends over 80 characters. So let's
> change the column limit to be 0 instead. This ensures that the formatter
> doesn't complain about code strictly not following the column limit.

The column limit does lead to quite a few false positives. At the same
time though, in some ways having a tool point out all the instances it
occurs does make it easier to review if any should be addressed.

If the goal is to have a CI job that we generally expect to pass, then
it makes sense to remove it. I don't feel super strongly either way.

-Justin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/3] clang-format: don't enforce the column limit
  2024-10-09 15:45   ` Justin Tobler
@ 2024-10-09 22:32     ` Junio C Hamano
  2024-10-10 16:48       ` karthik nayak
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2024-10-09 22:32 UTC (permalink / raw)
  To: Justin Tobler; +Cc: Karthik Nayak, git

Justin Tobler <jltobler@gmail.com> writes:

> On 24/10/09 05:55AM, Karthik Nayak wrote:
>> The current value for the column limit is set to 80. While this is as
>> expected, we often prefer readability over this strict limit. This means
>> it is common to find code which extends over 80 characters. So let's
>> change the column limit to be 0 instead. This ensures that the formatter
>> doesn't complain about code strictly not following the column limit.
>
> The column limit does lead to quite a few false positives. At the same
> time though, in some ways having a tool point out all the instances it
> occurs does make it easier to review if any should be addressed.
>
> If the goal is to have a CI job that we generally expect to pass, then
> it makes sense to remove it. I don't feel super strongly either way.

Is it possible for gatekeeper jobs to complain only on newly added
violations?  Then it is fine to have a limit with a bit of slack,
say like 96 columns (with 16-column readability slack).


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 3/3] clang-format: align consecutive macro definitions
  2024-10-09 12:56 ` [PATCH 3/3] clang-format: align consecutive macro definitions Karthik Nayak
@ 2024-10-10  8:27   ` Toon Claes
  0 siblings, 0 replies; 41+ messages in thread
From: Toon Claes @ 2024-10-10  8:27 UTC (permalink / raw)
  To: Karthik Nayak, karthik.188; +Cc: git

Karthik Nayak <karthik.188@gmail.com> writes:

> We generally align consecutive macro definitions for better readability.
> So let's add the rule in clang-format to follow this.
>
> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>

I think it's valuable to add an example what bad vs good formatted code
looks like.

Otherwise no comments about this series.

--
Toon

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/3] clang-format: don't enforce the column limit
  2024-10-09 22:32     ` Junio C Hamano
@ 2024-10-10 16:48       ` karthik nayak
  0 siblings, 0 replies; 41+ messages in thread
From: karthik nayak @ 2024-10-10 16:48 UTC (permalink / raw)
  To: Junio C Hamano, Justin Tobler; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1313 bytes --]

Junio C Hamano <gitster@pobox.com> writes:

> Justin Tobler <jltobler@gmail.com> writes:
>
>> On 24/10/09 05:55AM, Karthik Nayak wrote:
>>> The current value for the column limit is set to 80. While this is as
>>> expected, we often prefer readability over this strict limit. This means
>>> it is common to find code which extends over 80 characters. So let's
>>> change the column limit to be 0 instead. This ensures that the formatter
>>> doesn't complain about code strictly not following the column limit.
>>
>> The column limit does lead to quite a few false positives. At the same
>> time though, in some ways having a tool point out all the instances it
>> occurs does make it easier to review if any should be addressed.
>>
>> If the goal is to have a CI job that we generally expect to pass, then
>> it makes sense to remove it. I don't feel super strongly either way.
> Is it possible for gatekeeper jobs to complain only on newly added
> violations?

The CI job is indeed only checking the newly added code. We do this
using 'git clang-format' which takes in the basecommit as a param. This
is in 'ci/run-style-check.sh'.

> Then it is fine to have a limit with a bit of slack,
> say like 96 columns (with 16-column readability slack).

This is a good idea. Let me add change this in the next version.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v2 0/3] clang-format: fix rules to make the CI job cleaner
  2024-10-09 12:51 [PATCH 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
                   ` (2 preceding siblings ...)
  2024-10-09 12:56 ` [PATCH 3/3] clang-format: align consecutive macro definitions Karthik Nayak
@ 2024-10-10 17:59 ` Karthik Nayak
  2024-10-10 17:59   ` [PATCH v2 1/3] clang-format: change column limit to 96 characters Karthik Nayak
                     ` (2 more replies)
  2024-10-12  1:49 ` [PATCH v3 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
  2024-10-18  8:46 ` [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner Karthik Nayak
  5 siblings, 3 replies; 41+ messages in thread
From: Karthik Nayak @ 2024-10-10 17:59 UTC (permalink / raw)
  To: karthik.188; +Cc: git, gitster, jltobler, toon

The clang-format CI job is currently cluttered due to too many errors being
reported. See some of the examples here:

* https://gitlab.com/gitlab-org/git/-/jobs/7854601948
* https://gitlab.com/gitlab-org/git/-/jobs/7843131109

So modify the clang-format with the following changes:
1. Modify the column limit to 96 to provide some slack so we can make code more
readable and don't have to be constrained to 80 characters always. This is the
cause of most of the errors reported by the tool and should cleanup the reports
so we can actually focus on the real remaining issues.
2. Don't align expressions after linebreaks to ensure that we instead rely on
'ContinuationIndentWidth'. This fix is rather small and ensures that instead of
trying to align wrapped expressions, we follow the indentation width.
3. Align the macro definitions. This is something we follow to keep the macros
readable. 

I will still keep monitoring the jobs from time to time to ensure we can fine
tune more as needed, if someone see's something odd, do keep me in the loop.

Thanks

Changes over the previous version:
1. Change the column limit from 0 to 96, this way we still have a column limit
but provide some slack.
2. Add an example for the third commit.

Karthik Nayak (3):
  clang-format: change column limit to 96 characters
  clang-format: don't align expressions after linebreaks
  clang-format: align consecutive macro definitions

 .clang-format | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

Range-diff against v1:
1:  34cd0c29f6 ! 1:  e22ffbe0f6 clang-format: don't enforce the column limit
    @@ Metadata
     Author: Karthik Nayak <karthik.188@gmail.com>
     
      ## Commit message ##
    -    clang-format: don't enforce the column limit
    +    clang-format: change column limit to 96 characters
     
         The current value for the column limit is set to 80. While this is as
         expected, we often prefer readability over this strict limit. This means
         it is common to find code which extends over 80 characters. So let's
    -    change the column limit to be 0 instead. This ensures that the formatter
    -    doesn't complain about code strictly not following the column limit.
    +    change the column limit to be 96 instead. This provides some slack so we
    +    can ensure readability takes preference over the 80 character hard
    +    limit.
     
         Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
     
    @@ .clang-format: UseTab: Always
      ContinuationIndentWidth: 8
     -ColumnLimit: 80
     +
    -+# While we recommend keeping column limit to 80, we don't want to
    -+# enforce it as we generally are more lenient with this rule and
    -+# prefer to prioritize readability.
    -+ColumnLimit: 0
    ++# While we recommend keeping column limit to 80, we want to also provide
    ++# some slack to maintain readability.
    ++ColumnLimit: 96
      
      # C Language specifics
      Language: Cpp
2:  3557012fea = 2:  b55d5d2c14 clang-format: don't align expressions after linebreaks
3:  ad023a6cf6 ! 3:  6ebcd2690e clang-format: align consecutive macro definitions
    @@ Metadata
      ## Commit message ##
         clang-format: align consecutive macro definitions
     
    -    We generally align consecutive macro definitions for better readability.
    +    We generally align consecutive macro definitions for better readability:
    +
    +      #define OUTPUT_ANNOTATE_COMPAT      (1U<<0)
    +      #define OUTPUT_LONG_OBJECT_NAME     (1U<<1)
    +      #define OUTPUT_RAW_TIMESTAMP        (1U<<2)
    +      #define OUTPUT_PORCELAIN            (1U<<3)
    +
    +    over
    +
    +      #define OUTPUT_ANNOTATE_COMPAT (1U<<0)
    +      #define OUTPUT_LONG_OBJECT_NAME (1U<<1)
    +      #define OUTPUT_RAW_TIMESTAMP (1U<<2)
    +      #define OUTPUT_PORCELAIN (1U<<3)
    +
         So let's add the rule in clang-format to follow this.
     
         Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
-- 
2.47.0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v2 1/3] clang-format: change column limit to 96 characters
  2024-10-10 17:59 ` [PATCH v2 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
@ 2024-10-10 17:59   ` Karthik Nayak
  2024-10-10 18:11     ` Kyle Lippincott
  2024-10-10 17:59   ` [PATCH v2 2/3] clang-format: don't align expressions after linebreaks Karthik Nayak
  2024-10-10 17:59   ` [PATCH v2 3/3] clang-format: align consecutive macro definitions Karthik Nayak
  2 siblings, 1 reply; 41+ messages in thread
From: Karthik Nayak @ 2024-10-10 17:59 UTC (permalink / raw)
  To: karthik.188; +Cc: git, gitster, jltobler, toon

The current value for the column limit is set to 80. While this is as
expected, we often prefer readability over this strict limit. This means
it is common to find code which extends over 80 characters. So let's
change the column limit to be 96 instead. This provides some slack so we
can ensure readability takes preference over the 80 character hard
limit.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 .clang-format | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/.clang-format b/.clang-format
index 41969eca4b..684ab32d28 100644
--- a/.clang-format
+++ b/.clang-format
@@ -12,7 +12,10 @@ UseTab: Always
 TabWidth: 8
 IndentWidth: 8
 ContinuationIndentWidth: 8
-ColumnLimit: 80
+
+# While we recommend keeping column limit to 80, we want to also provide
+# some slack to maintain readability.
+ColumnLimit: 96
 
 # C Language specifics
 Language: Cpp
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 2/3] clang-format: don't align expressions after linebreaks
  2024-10-10 17:59 ` [PATCH v2 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
  2024-10-10 17:59   ` [PATCH v2 1/3] clang-format: change column limit to 96 characters Karthik Nayak
@ 2024-10-10 17:59   ` Karthik Nayak
  2024-10-10 17:59   ` [PATCH v2 3/3] clang-format: align consecutive macro definitions Karthik Nayak
  2 siblings, 0 replies; 41+ messages in thread
From: Karthik Nayak @ 2024-10-10 17:59 UTC (permalink / raw)
  To: karthik.188; +Cc: git, gitster, jltobler, toon

We enforce alignment of expressions after linebreaks. Which means for
code such as

    return a || b;

it will expect:

   return a ||
          b;

we instead want 'b' to be indent with tabs, which is already done by the
'ContinuationIndentWidth' variable. So let's explicitly set
'AlignOperands' to false.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 .clang-format | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/.clang-format b/.clang-format
index 684ab32d28..d655b08ff6 100644
--- a/.clang-format
+++ b/.clang-format
@@ -42,10 +42,9 @@ AlignConsecutiveDeclarations: false
 #   int cccccccc;
 AlignEscapedNewlines: Left
 
-# Align operands of binary and ternary expressions
-# int aaa = bbbbbbbbbbb +
-#           cccccc;
-AlignOperands: true
+# Don't enforce alignment after linebreaks and instead
+# rely on the ContinuationIndentWidth value.
+AlignOperands: false
 
 # Don't align trailing comments
 # int a; // Comment a
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 3/3] clang-format: align consecutive macro definitions
  2024-10-10 17:59 ` [PATCH v2 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
  2024-10-10 17:59   ` [PATCH v2 1/3] clang-format: change column limit to 96 characters Karthik Nayak
  2024-10-10 17:59   ` [PATCH v2 2/3] clang-format: don't align expressions after linebreaks Karthik Nayak
@ 2024-10-10 17:59   ` Karthik Nayak
  2 siblings, 0 replies; 41+ messages in thread
From: Karthik Nayak @ 2024-10-10 17:59 UTC (permalink / raw)
  To: karthik.188; +Cc: git, gitster, jltobler, toon

We generally align consecutive macro definitions for better readability:

  #define OUTPUT_ANNOTATE_COMPAT      (1U<<0)
  #define OUTPUT_LONG_OBJECT_NAME     (1U<<1)
  #define OUTPUT_RAW_TIMESTAMP        (1U<<2)
  #define OUTPUT_PORCELAIN            (1U<<3)

over

  #define OUTPUT_ANNOTATE_COMPAT (1U<<0)
  #define OUTPUT_LONG_OBJECT_NAME (1U<<1)
  #define OUTPUT_RAW_TIMESTAMP (1U<<2)
  #define OUTPUT_PORCELAIN (1U<<3)

So let's add the rule in clang-format to follow this.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 .clang-format | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/.clang-format b/.clang-format
index d655b08ff6..22ae882923 100644
--- a/.clang-format
+++ b/.clang-format
@@ -35,6 +35,9 @@ AlignConsecutiveAssignments: false
 # double b = 3.14;
 AlignConsecutiveDeclarations: false
 
+# Align consecutive macro definitions.
+AlignConsecutiveMacros: true
+
 # Align escaped newlines as far left as possible
 # #define A   \
 #   int aaaa; \
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 1/3] clang-format: change column limit to 96 characters
  2024-10-10 17:59   ` [PATCH v2 1/3] clang-format: change column limit to 96 characters Karthik Nayak
@ 2024-10-10 18:11     ` Kyle Lippincott
  2024-10-10 19:49       ` karthik nayak
  0 siblings, 1 reply; 41+ messages in thread
From: Kyle Lippincott @ 2024-10-10 18:11 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, gitster, jltobler, toon

On Thu, Oct 10, 2024 at 11:00 AM Karthik Nayak <karthik.188@gmail.com> wrote:
>
> The current value for the column limit is set to 80. While this is as
> expected, we often prefer readability over this strict limit. This means
> it is common to find code which extends over 80 characters. So let's
> change the column limit to be 96 instead. This provides some slack so we
> can ensure readability takes preference over the 80 character hard
> limit.
>
> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
> ---
>  .clang-format | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/.clang-format b/.clang-format
> index 41969eca4b..684ab32d28 100644
> --- a/.clang-format
> +++ b/.clang-format
> @@ -12,7 +12,10 @@ UseTab: Always
>  TabWidth: 8
>  IndentWidth: 8
>  ContinuationIndentWidth: 8
> -ColumnLimit: 80
> +
> +# While we recommend keeping column limit to 80, we want to also provide
> +# some slack to maintain readability.
> +ColumnLimit: 96
>
>  # C Language specifics
>  Language: Cpp
> --
> 2.47.0
>
>

I think this means that the next automated `clang-format` invocation
will un-wrap lines that were wrapped at 80 columns (not characters)
but fit in 96 columns. Modifying this setting and running
`clang-format -i *.{c,h}` produces a lot of diffs of that kind. I
don't think there's a way of setting a soft column limit in
clang-format.

Personally, I'd be fine with a higher column limit, but we'd need to
make a conscious change to the style guidelines for that.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 1/3] clang-format: change column limit to 96 characters
  2024-10-10 18:11     ` Kyle Lippincott
@ 2024-10-10 19:49       ` karthik nayak
  2024-10-10 20:09         ` Kyle Lippincott
  0 siblings, 1 reply; 41+ messages in thread
From: karthik nayak @ 2024-10-10 19:49 UTC (permalink / raw)
  To: Kyle Lippincott; +Cc: git, gitster, jltobler, toon

[-- Attachment #1: Type: text/plain, Size: 2154 bytes --]

Kyle Lippincott <spectral@google.com> writes:

> On Thu, Oct 10, 2024 at 11:00 AM Karthik Nayak <karthik.188@gmail.com> wrote:
>>
>> The current value for the column limit is set to 80. While this is as
>> expected, we often prefer readability over this strict limit. This means
>> it is common to find code which extends over 80 characters. So let's
>> change the column limit to be 96 instead. This provides some slack so we
>> can ensure readability takes preference over the 80 character hard
>> limit.
>>
>> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
>> ---
>>  .clang-format | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/.clang-format b/.clang-format
>> index 41969eca4b..684ab32d28 100644
>> --- a/.clang-format
>> +++ b/.clang-format
>> @@ -12,7 +12,10 @@ UseTab: Always
>>  TabWidth: 8
>>  IndentWidth: 8
>>  ContinuationIndentWidth: 8
>> -ColumnLimit: 80
>> +
>> +# While we recommend keeping column limit to 80, we want to also provide
>> +# some slack to maintain readability.
>> +ColumnLimit: 96
>>
>>  # C Language specifics
>>  Language: Cpp
>> --
>> 2.47.0
>>
>>
>
> I think this means that the next automated `clang-format` invocation
> will un-wrap lines that were wrapped at 80 columns (not characters)
> but fit in 96 columns. Modifying this setting and running
> `clang-format -i *.{c,h}` produces a lot of diffs of that kind. I
> don't think there's a way of setting a soft column limit in
> clang-format.

Ah! Good point.

> Personally, I'd be fine with a higher column limit, but we'd need to
> make a conscious change to the style guidelines for that.

With this, I would say that the best choice here would be to actually
set it to 0 like the previous version. So that we don't actually enforce
the column limit.

We could perhaps set the value here in the '.clang-format' to 0. While
also setting 'max_line_length = 95' in the '.editorconfig'. That would
mean that we don't enforce a width, but we nudge editors to wrap at 95
characters. Here contributors would still have the power to decide the
adequate width as needed.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 1/3] clang-format: change column limit to 96 characters
  2024-10-10 19:49       ` karthik nayak
@ 2024-10-10 20:09         ` Kyle Lippincott
  0 siblings, 0 replies; 41+ messages in thread
From: Kyle Lippincott @ 2024-10-10 20:09 UTC (permalink / raw)
  To: karthik nayak; +Cc: git, gitster, jltobler, toon

On Thu, Oct 10, 2024 at 12:49 PM karthik nayak <karthik.188@gmail.com> wrote:
>
> Kyle Lippincott <spectral@google.com> writes:
>
> > On Thu, Oct 10, 2024 at 11:00 AM Karthik Nayak <karthik.188@gmail.com> wrote:
> >>
> >> The current value for the column limit is set to 80. While this is as
> >> expected, we often prefer readability over this strict limit. This means
> >> it is common to find code which extends over 80 characters. So let's
> >> change the column limit to be 96 instead. This provides some slack so we
> >> can ensure readability takes preference over the 80 character hard
> >> limit.
> >>
> >> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
> >> ---
> >>  .clang-format | 5 ++++-
> >>  1 file changed, 4 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/.clang-format b/.clang-format
> >> index 41969eca4b..684ab32d28 100644
> >> --- a/.clang-format
> >> +++ b/.clang-format
> >> @@ -12,7 +12,10 @@ UseTab: Always
> >>  TabWidth: 8
> >>  IndentWidth: 8
> >>  ContinuationIndentWidth: 8
> >> -ColumnLimit: 80
> >> +
> >> +# While we recommend keeping column limit to 80, we want to also provide
> >> +# some slack to maintain readability.
> >> +ColumnLimit: 96
> >>
> >>  # C Language specifics
> >>  Language: Cpp
> >> --
> >> 2.47.0
> >>
> >>
> >
> > I think this means that the next automated `clang-format` invocation
> > will un-wrap lines that were wrapped at 80 columns (not characters)
> > but fit in 96 columns. Modifying this setting and running
> > `clang-format -i *.{c,h}` produces a lot of diffs of that kind. I
> > don't think there's a way of setting a soft column limit in
> > clang-format.
>
> Ah! Good point.
>
> > Personally, I'd be fine with a higher column limit, but we'd need to
> > make a conscious change to the style guidelines for that.
>
> With this, I would say that the best choice here would be to actually
> set it to 0 like the previous version. So that we don't actually enforce
> the column limit.
>
> We could perhaps set the value here in the '.clang-format' to 0. While
> also setting 'max_line_length = 95' in the '.editorconfig'. That would
> mean that we don't enforce a width, but we nudge editors to wrap at 95
> characters. Here contributors would still have the power to decide the
> adequate width as needed.

One thing to be cautious of is column vs character distinctions.
Because we use tabs[^1], one character regularly has a display width
of 8 columns.

[^1]: Another personal opinion, and this is going to be very
contentious so please don't think that others agree with me on this,
because I suspect I'm in the minority: tabs are a massive mistake. The
*only* benefit we get from our current use of tabs is smaller source
files, and we get numerous headaches because of them. I could go on
about this, because I have Opinions here, but this probably isn't the
right place ;)

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v3 0/3] clang-format: fix rules to make the CI job cleaner
  2024-10-09 12:51 [PATCH 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
                   ` (3 preceding siblings ...)
  2024-10-10 17:59 ` [PATCH v2 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
@ 2024-10-12  1:49 ` Karthik Nayak
  2024-10-12  1:49   ` [PATCH v3 1/3] clang-format: re-adjust line break penalties Karthik Nayak
                     ` (2 more replies)
  2024-10-18  8:46 ` [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner Karthik Nayak
  5 siblings, 3 replies; 41+ messages in thread
From: Karthik Nayak @ 2024-10-12  1:49 UTC (permalink / raw)
  To: karthik.188; +Cc: git, gitster, jltobler, toon, johannes.schindelin, spectral

The clang-format CI job is currently cluttered due to too many errors being
reported. See some of the examples here:

* https://gitlab.com/gitlab-org/git/-/jobs/7854601948
* https://gitlab.com/gitlab-org/git/-/jobs/7843131109

So modify the clang-format with the following changes:
1. Modify the penalties for linebreaks to be more considerate towards
readability. The commit goes into detail explaining how/why.  
2. Don't align expressions after linebreaks to ensure that we instead rely on
'ContinuationIndentWidth'. This fix is rather small and ensures that instead of
trying to align wrapped expressions, we follow the indentation width.
3. Align the macro definitions. This is something we follow to keep the macros
readable. 

I will still keep monitoring the jobs from time to time to ensure we can fine
tune more as needed, if someone see's something odd, do keep me in the loop.

Thanks

Changes over the previous version:
1. I figured that we can keep the column limit to the previous 80 while still
having some breathing room by tweaking the penalties associated with line breaks.
Also CC'in Johannes here, since this builds on top of his changes.

Karthik Nayak (3):
  clang-format: re-adjust line break penalties
  clang-format: align consecutive macro definitions
  clang-format: don't align expressions after linebreaks

 .clang-format | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

Range-diff against v2:
1:  e22ffbe0f6 ! 1:  74bbd2f9db clang-format: change column limit to 96 characters
    @@ Metadata
     Author: Karthik Nayak <karthik.188@gmail.com>
     
      ## Commit message ##
    -    clang-format: change column limit to 96 characters
    +    clang-format: re-adjust line break penalties
     
    -    The current value for the column limit is set to 80. While this is as
    -    expected, we often prefer readability over this strict limit. This means
    -    it is common to find code which extends over 80 characters. So let's
    -    change the column limit to be 96 instead. This provides some slack so we
    -    can ensure readability takes preference over the 80 character hard
    -    limit.
    +    In 42efde4c29 (clang-format: adjust line break penalties, 2017-09-29) we
    +    adjusted the line break penalties to really fine tune what we care about
    +    while doing line breaks. Modify some of those to be more inline with
    +    what we care about in the Git project now.
    +
    +    We need to understand that the values set to penalties in
    +    '.clang-format' are relative to each other and do not hold any absolute
    +    value. The penalty arguments take an 'Unsigned' value, so we have some
    +    liberty over the values we can set.
    +
    +    First, in that commit, we decided, that under no circumstances do we
    +    want to exceed 80 characters. This seems a bit too strict. We do
    +    overshoot this limit from time to time to prioritize readability. So
    +    let's reduce the value for 'PenaltyExcessCharacter' to 10. This means we
    +    that we add a penalty of 10 for each character that exceeds the column
    +    limit. By itself this is enough to restrict to column limit. Tuning
    +    other penalties in relation to this is what is important.
    +
    +    The penalty `PenaltyBreakAssignment` talks about the penalty for
    +    breaking an assignment operator on to the next line. In our project, we
    +    are okay with this, so giving a value of 5, which is below the value for
    +    'PenaltyExcessCharacter' ensures that in the end, even 1 character over
    +    the column limit is not worth keeping an assignment on the same line.
    +
    +    Similarly set the penalty for breaking before the first call parameter
    +    'PenaltyBreakBeforeFirstCallParameter' and the penalty for breaking
    +    comments 'PenaltyBreakComment' and the penalty for breaking string
    +    literals 'PenaltyBreakString' also to 5.
    +
    +    Finally, we really care about not breaking the return type into its own
    +    line and we really care about not breaking before an open parenthesis.
    +    This avoids weird formatting like:
    +
    +       static const struct strbuf *
    +              a_really_really_large_function_name(struct strbuf resolved,
    +              const char *path, int flags)
    +
    +    or
    +
    +       static const struct strbuf *a_really_really_large_function_name(
    +                struct strbuf resolved, const char *path, int flags)
    +
    +    to instead have something more readable like:
    +
    +       static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
    +              const char *path, int flags)
    +
    +    This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
    +    and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
    +    few characters above the 80 column limit to make code more readable.
     
         Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
     
      ## .clang-format ##
    -@@ .clang-format: UseTab: Always
    - TabWidth: 8
    - IndentWidth: 8
    - ContinuationIndentWidth: 8
    --ColumnLimit: 80
    -+
    -+# While we recommend keeping column limit to 80, we want to also provide
    -+# some slack to maintain readability.
    -+ColumnLimit: 96
    +@@ .clang-format: KeepEmptyLinesAtTheStartOfBlocks: false
    + 
    + # Penalties
    + # This decides what order things should be done if a line is too long
    +-PenaltyBreakAssignment: 10
    +-PenaltyBreakBeforeFirstCallParameter: 30
    +-PenaltyBreakComment: 10
    ++PenaltyBreakAssignment: 5
    ++PenaltyBreakBeforeFirstCallParameter: 5
    ++PenaltyBreakComment: 5
    + PenaltyBreakFirstLessLess: 0
    +-PenaltyBreakString: 10
    +-PenaltyExcessCharacter: 100
    +-PenaltyReturnTypeOnItsOwnLine: 60
    ++PenaltyBreakOpenParenthesis: 300
    ++PenaltyBreakString: 5
    ++PenaltyExcessCharacter: 10
    ++PenaltyReturnTypeOnItsOwnLine: 300
      
    - # C Language specifics
    - Language: Cpp
    + # Don't sort #include's
    + SortIncludes: false
3:  6ebcd2690e = 2:  1586d53769 clang-format: align consecutive macro definitions
2:  b55d5d2c14 ! 3:  36a53299c1 clang-format: don't align expressions after linebreaks
    @@ Commit message
         Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
     
      ## .clang-format ##
    -@@ .clang-format: AlignConsecutiveDeclarations: false
    +@@ .clang-format: AlignConsecutiveMacros: true
      #   int cccccccc;
      AlignEscapedNewlines: Left
      
-- 
2.47.0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v3 1/3] clang-format: re-adjust line break penalties
  2024-10-12  1:49 ` [PATCH v3 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
@ 2024-10-12  1:49   ` Karthik Nayak
  2024-10-14  9:08     ` Toon Claes
  2024-10-14 20:59     ` Kyle Lippincott
  2024-10-12  1:49   ` [PATCH v3 2/3] clang-format: align consecutive macro definitions Karthik Nayak
  2024-10-12  1:49   ` [PATCH v3 3/3] clang-format: don't align expressions after linebreaks Karthik Nayak
  2 siblings, 2 replies; 41+ messages in thread
From: Karthik Nayak @ 2024-10-12  1:49 UTC (permalink / raw)
  To: karthik.188; +Cc: git, gitster, jltobler, toon, johannes.schindelin, spectral

In 42efde4c29 (clang-format: adjust line break penalties, 2017-09-29) we
adjusted the line break penalties to really fine tune what we care about
while doing line breaks. Modify some of those to be more inline with
what we care about in the Git project now.

We need to understand that the values set to penalties in
'.clang-format' are relative to each other and do not hold any absolute
value. The penalty arguments take an 'Unsigned' value, so we have some
liberty over the values we can set.

First, in that commit, we decided, that under no circumstances do we
want to exceed 80 characters. This seems a bit too strict. We do
overshoot this limit from time to time to prioritize readability. So
let's reduce the value for 'PenaltyExcessCharacter' to 10. This means we
that we add a penalty of 10 for each character that exceeds the column
limit. By itself this is enough to restrict to column limit. Tuning
other penalties in relation to this is what is important.

The penalty `PenaltyBreakAssignment` talks about the penalty for
breaking an assignment operator on to the next line. In our project, we
are okay with this, so giving a value of 5, which is below the value for
'PenaltyExcessCharacter' ensures that in the end, even 1 character over
the column limit is not worth keeping an assignment on the same line.

Similarly set the penalty for breaking before the first call parameter
'PenaltyBreakBeforeFirstCallParameter' and the penalty for breaking
comments 'PenaltyBreakComment' and the penalty for breaking string
literals 'PenaltyBreakString' also to 5.

Finally, we really care about not breaking the return type into its own
line and we really care about not breaking before an open parenthesis.
This avoids weird formatting like:

   static const struct strbuf *
          a_really_really_large_function_name(struct strbuf resolved,
          const char *path, int flags)

or

   static const struct strbuf *a_really_really_large_function_name(
   	    struct strbuf resolved, const char *path, int flags)

to instead have something more readable like:

   static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
          const char *path, int flags)

This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
few characters above the 80 column limit to make code more readable.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 .clang-format | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/.clang-format b/.clang-format
index 41969eca4b..66a2360ae5 100644
--- a/.clang-format
+++ b/.clang-format
@@ -209,13 +209,14 @@ KeepEmptyLinesAtTheStartOfBlocks: false

 # Penalties
 # This decides what order things should be done if a line is too long
-PenaltyBreakAssignment: 10
-PenaltyBreakBeforeFirstCallParameter: 30
-PenaltyBreakComment: 10
+PenaltyBreakAssignment: 5
+PenaltyBreakBeforeFirstCallParameter: 5
+PenaltyBreakComment: 5
 PenaltyBreakFirstLessLess: 0
-PenaltyBreakString: 10
-PenaltyExcessCharacter: 100
-PenaltyReturnTypeOnItsOwnLine: 60
+PenaltyBreakOpenParenthesis: 300
+PenaltyBreakString: 5
+PenaltyExcessCharacter: 10
+PenaltyReturnTypeOnItsOwnLine: 300

 # Don't sort #include's
 SortIncludes: false
-- 
2.47.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v3 2/3] clang-format: align consecutive macro definitions
  2024-10-12  1:49 ` [PATCH v3 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
  2024-10-12  1:49   ` [PATCH v3 1/3] clang-format: re-adjust line break penalties Karthik Nayak
@ 2024-10-12  1:49   ` Karthik Nayak
  2024-10-14 21:12     ` Kyle Lippincott
  2024-10-12  1:49   ` [PATCH v3 3/3] clang-format: don't align expressions after linebreaks Karthik Nayak
  2 siblings, 1 reply; 41+ messages in thread
From: Karthik Nayak @ 2024-10-12  1:49 UTC (permalink / raw)
  To: karthik.188; +Cc: git, gitster, jltobler, toon, johannes.schindelin, spectral

We generally align consecutive macro definitions for better readability:

  #define OUTPUT_ANNOTATE_COMPAT      (1U<<0)
  #define OUTPUT_LONG_OBJECT_NAME     (1U<<1)
  #define OUTPUT_RAW_TIMESTAMP        (1U<<2)
  #define OUTPUT_PORCELAIN            (1U<<3)

over

  #define OUTPUT_ANNOTATE_COMPAT (1U<<0)
  #define OUTPUT_LONG_OBJECT_NAME (1U<<1)
  #define OUTPUT_RAW_TIMESTAMP (1U<<2)
  #define OUTPUT_PORCELAIN (1U<<3)

So let's add the rule in clang-format to follow this.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 .clang-format | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/.clang-format b/.clang-format
index 66a2360ae5..9547fe1b77 100644
--- a/.clang-format
+++ b/.clang-format
@@ -32,6 +32,9 @@ AlignConsecutiveAssignments: false
 # double b = 3.14;
 AlignConsecutiveDeclarations: false
 
+# Align consecutive macro definitions.
+AlignConsecutiveMacros: true
+
 # Align escaped newlines as far left as possible
 # #define A   \
 #   int aaaa; \
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v3 3/3] clang-format: don't align expressions after linebreaks
  2024-10-12  1:49 ` [PATCH v3 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
  2024-10-12  1:49   ` [PATCH v3 1/3] clang-format: re-adjust line break penalties Karthik Nayak
  2024-10-12  1:49   ` [PATCH v3 2/3] clang-format: align consecutive macro definitions Karthik Nayak
@ 2024-10-12  1:49   ` Karthik Nayak
  2024-10-14 21:23     ` Kyle Lippincott
  2 siblings, 1 reply; 41+ messages in thread
From: Karthik Nayak @ 2024-10-12  1:49 UTC (permalink / raw)
  To: karthik.188; +Cc: git, gitster, jltobler, toon, johannes.schindelin, spectral

We enforce alignment of expressions after linebreaks. Which means for
code such as

    return a || b;

it will expect:

   return a ||
          b;

we instead want 'b' to be indent with tabs, which is already done by the
'ContinuationIndentWidth' variable. So let's explicitly set
'AlignOperands' to false.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 .clang-format | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/.clang-format b/.clang-format
index 9547fe1b77..b48e7813e4 100644
--- a/.clang-format
+++ b/.clang-format
@@ -42,10 +42,9 @@ AlignConsecutiveMacros: true
 #   int cccccccc;
 AlignEscapedNewlines: Left
 
-# Align operands of binary and ternary expressions
-# int aaa = bbbbbbbbbbb +
-#           cccccc;
-AlignOperands: true
+# Don't enforce alignment after linebreaks and instead
+# rely on the ContinuationIndentWidth value.
+AlignOperands: false
 
 # Don't align trailing comments
 # int a; // Comment a
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/3] clang-format: re-adjust line break penalties
  2024-10-12  1:49   ` [PATCH v3 1/3] clang-format: re-adjust line break penalties Karthik Nayak
@ 2024-10-14  9:08     ` Toon Claes
  2024-10-14 21:14       ` Taylor Blau
  2024-10-15 11:20       ` karthik nayak
  2024-10-14 20:59     ` Kyle Lippincott
  1 sibling, 2 replies; 41+ messages in thread
From: Toon Claes @ 2024-10-14  9:08 UTC (permalink / raw)
  To: Karthik Nayak, karthik.188
  Cc: git, gitster, jltobler, johannes.schindelin, spectral

Karthik Nayak <karthik.188@gmail.com> writes:

[snip]

> This avoids weird formatting like:
>
>    static const struct strbuf *
>           a_really_really_large_function_name(struct strbuf resolved,
>           const char *path, int flags)
>
> or
>
>    static const struct strbuf *a_really_really_large_function_name(
>    	    struct strbuf resolved, const char *path, int flags)
>
> to instead have something more readable like:
>
>    static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
>           const char *path, int flags)
>
> This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
> and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
> few characters above the 80 column limit to make code more readable.

I'm really liking the idea of penalties, but I feel we're relying too
much on guestimation of these values. What do you think about adding
example files to our codebase? Having concrete examples at hand will
allow us to tweak the values in the future, while preserving behavior
for existing cases. Or when we decide to change them, we understand
what and when.

Now, I'm not sure where to put such files. I think I would suggest
something like t/style-lint or t/clang-format. Anyway, for our tooling
it doesn't seem to matter, because both `make style` and
`ci/run-style-check.sh` pick up all .c and .h files anywhere in the
source tree. Adding a README to that directory will help people
understand why the files are there.

--
Toon

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/3] clang-format: re-adjust line break penalties
  2024-10-12  1:49   ` [PATCH v3 1/3] clang-format: re-adjust line break penalties Karthik Nayak
  2024-10-14  9:08     ` Toon Claes
@ 2024-10-14 20:59     ` Kyle Lippincott
  2024-10-15 12:37       ` karthik nayak
  1 sibling, 1 reply; 41+ messages in thread
From: Kyle Lippincott @ 2024-10-14 20:59 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, gitster, jltobler, toon, johannes.schindelin

On Fri, Oct 11, 2024 at 6:50 PM Karthik Nayak <karthik.188@gmail.com> wrote:
>
> In 42efde4c29 (clang-format: adjust line break penalties, 2017-09-29) we
> adjusted the line break penalties to really fine tune what we care about
> while doing line breaks. Modify some of those to be more inline with
> what we care about in the Git project now.
>
> We need to understand that the values set to penalties in
> '.clang-format' are relative to each other and do not hold any absolute
> value. The penalty arguments take an 'Unsigned' value, so we have some
> liberty over the values we can set.
>
> First, in that commit, we decided, that under no circumstances do we
> want to exceed 80 characters. This seems a bit too strict. We do
> overshoot this limit from time to time to prioritize readability.

I think that attempting to get the weights right so as to avoid cases
where there was an intentional affordance for readability is going to
be essentially impossible. Areas where there's an intentional
disregard for the clang-format-generated formatting should disable the
formatter for that line/region, instead of trying to find a way to
adjust the rules to produce something that's going to end up being
context dependent.

Example: In ref-filter.c, there's 13 lines when initializing the
`valid_atom` array that are >80 characters, and 20 lines that are >80
columns (when using 8-space tabs). Line breaking in that block of code
may be undesirable, so just disable clang-format there. I don't think
there's a consistent set of penalties you could establish that would
handle that well without mishandling some other section of code.

It's also not clear what the reason for the overshoot is in many cases.
- difference between "80 characters" and "80 columns"?
    - (1394 >80char lines in *.{h,c}, 4849 >80col lines in the same files)
- intentional for readability?
- refactorings pushed originally compliant lines out of compliance?
- no one caught it and it was just added without any intentional decision?

> So
> let's reduce the value for 'PenaltyExcessCharacter' to 10. This means we
> that we add a penalty of 10 for each character that exceeds the column
> limit. By itself this is enough to restrict to column limit. Tuning
> other penalties in relation to this is what is important.
>
> The penalty `PenaltyBreakAssignment` talks about the penalty for
> breaking an assignment operator on to the next line. In our project, we
> are okay with this, so giving a value of 5, which is below the value for
> 'PenaltyExcessCharacter' ensures that in the end, even 1 character over
> the column limit is not worth keeping an assignment on the same line.
>
> Similarly set the penalty for breaking before the first call parameter
> 'PenaltyBreakBeforeFirstCallParameter' and the penalty for breaking
> comments 'PenaltyBreakComment' and the penalty for breaking string
> literals 'PenaltyBreakString' also to 5.
>
> Finally, we really care about not breaking the return type into its own
> line and we really care about not breaking before an open parenthesis.
> This avoids weird formatting like:
>
>    static const struct strbuf *
>           a_really_really_large_function_name(struct strbuf resolved,
>           const char *path, int flags)

Is this how it'd be indented without the penalties, or would it do
this, with the function name indented the same amount as the return
type (which is, in C, probably going to be the 0th column most times):

static const struct strbuf *
a_really_really_large_function_name(struct strbuf resolved,
        const char *path, int flags)

>
> or
>
>    static const struct strbuf *a_really_really_large_function_name(
>             struct strbuf resolved, const char *path, int flags)

Personal opinion: I prefer this over the version that has a single
argument on the first line. My preference for reading functions is:

return_type func_name(arg1, arg2,
                      arg3, arg4,
                      arg5, arg6, ...);

Or

return_type func_name(
        arg1, arg2, arg3, arg4,
        arg5, arg6, ...);

or, in some cases, putting every argument on their own line (typically
when the majority of the arguments are already on their own line, not
having one "hiding" somewhere is preferable, but at this point if
that's not what my formatter does, I don't fight it).

For functions that accept an obvious first parameter, such as
`strbuf_add`, maybe having the first parameter on the first line is
acceptable/desirable, since it's "obvious" what it is/does. But for
many functions that's not the case, and needing to read the end of the
first line, potentially beyond the 80th column, feels weird.

>
> to instead have something more readable like:
>
>    static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
>           const char *path, int flags)
>
> This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
> and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
> few characters above the 80 column limit to make code more readable.

A few examples, such as by formatting the code using the current rules
(since much of the codebase does not currently comply), and then
changing the penalties and seeing what changes, might be nice?


>
> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
> ---
>  .clang-format | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/.clang-format b/.clang-format
> index 41969eca4b..66a2360ae5 100644
> --- a/.clang-format
> +++ b/.clang-format
> @@ -209,13 +209,14 @@ KeepEmptyLinesAtTheStartOfBlocks: false
>
>  # Penalties
>  # This decides what order things should be done if a line is too long
> -PenaltyBreakAssignment: 10
> -PenaltyBreakBeforeFirstCallParameter: 30
> -PenaltyBreakComment: 10
> +PenaltyBreakAssignment: 5
> +PenaltyBreakBeforeFirstCallParameter: 5
> +PenaltyBreakComment: 5
>  PenaltyBreakFirstLessLess: 0
> -PenaltyBreakString: 10
> -PenaltyExcessCharacter: 100
> -PenaltyReturnTypeOnItsOwnLine: 60
> +PenaltyBreakOpenParenthesis: 300

How does this interact with PenaltyBreakBeforeFirstCallParameter? Does
one override the other?

> +PenaltyBreakString: 5
> +PenaltyExcessCharacter: 10
> +PenaltyReturnTypeOnItsOwnLine: 300
>
>  # Don't sort #include's
>  SortIncludes: false
> --
> 2.47.0
>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 2/3] clang-format: align consecutive macro definitions
  2024-10-12  1:49   ` [PATCH v3 2/3] clang-format: align consecutive macro definitions Karthik Nayak
@ 2024-10-14 21:12     ` Kyle Lippincott
  2024-10-15  7:57       ` karthik nayak
  0 siblings, 1 reply; 41+ messages in thread
From: Kyle Lippincott @ 2024-10-14 21:12 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, gitster, jltobler, toon, johannes.schindelin

On Fri, Oct 11, 2024 at 6:50 PM Karthik Nayak <karthik.188@gmail.com> wrote:
>
> We generally align consecutive macro definitions for better readability:
>
>   #define OUTPUT_ANNOTATE_COMPAT      (1U<<0)
>   #define OUTPUT_LONG_OBJECT_NAME     (1U<<1)
>   #define OUTPUT_RAW_TIMESTAMP        (1U<<2)
>   #define OUTPUT_PORCELAIN            (1U<<3)

I like this change, thanks. Is there a way of apply clang-format for
*only* one rule/aspect? i.e. can we apply *only* this, and preserve
every other line? At first glance, I don't see a way of doing it. If
there was, I might recommend a whole series just to applying these
changes, but with how out of compliance much of the codebase is today,
that's not going to be feasible; we'd need to format it in a way that
we might not want (the current style), and then fix it, and that seems
counterproductive.

>
> over
>
>   #define OUTPUT_ANNOTATE_COMPAT (1U<<0)
>   #define OUTPUT_LONG_OBJECT_NAME (1U<<1)
>   #define OUTPUT_RAW_TIMESTAMP (1U<<2)
>   #define OUTPUT_PORCELAIN (1U<<3)
>
> So let's add the rule in clang-format to follow this.
>
> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
> ---
>  .clang-format | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/.clang-format b/.clang-format
> index 66a2360ae5..9547fe1b77 100644
> --- a/.clang-format
> +++ b/.clang-format
> @@ -32,6 +32,9 @@ AlignConsecutiveAssignments: false
>  # double b = 3.14;
>  AlignConsecutiveDeclarations: false
>
> +# Align consecutive macro definitions.
> +AlignConsecutiveMacros: true
> +
>  # Align escaped newlines as far left as possible
>  # #define A   \
>  #   int aaaa; \
> --
> 2.47.0
>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/3] clang-format: re-adjust line break penalties
  2024-10-14  9:08     ` Toon Claes
@ 2024-10-14 21:14       ` Taylor Blau
  2024-10-15 11:35         ` karthik nayak
  2024-10-15 11:20       ` karthik nayak
  1 sibling, 1 reply; 41+ messages in thread
From: Taylor Blau @ 2024-10-14 21:14 UTC (permalink / raw)
  To: Toon Claes
  Cc: Karthik Nayak, git, gitster, jltobler, johannes.schindelin,
	spectral

On Mon, Oct 14, 2024 at 11:08:29AM +0200, Toon Claes wrote:
> Karthik Nayak <karthik.188@gmail.com> writes:
>
> [snip]
>
> > This avoids weird formatting like:
> >
> >    static const struct strbuf *
> >           a_really_really_large_function_name(struct strbuf resolved,
> >           const char *path, int flags)
> >
> > or
> >
> >    static const struct strbuf *a_really_really_large_function_name(
> >    	    struct strbuf resolved, const char *path, int flags)
> >
> > to instead have something more readable like:
> >
> >    static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
> >           const char *path, int flags)
> >
> > This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
> > and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
> > few characters above the 80 column limit to make code more readable.
>
> I'm really liking the idea of penalties, but I feel we're relying too
> much on guestimation of these values. What do you think about adding
> example files to our codebase? Having concrete examples at hand will
> allow us to tweak the values in the future, while preserving behavior
> for existing cases. Or when we decide to change them, we understand
> what and when.

I am not sure I see it the same way.

I might just be ill-informed or not experienced with these clang-format
rules, but having these penalties be defined as such makes it difficult
to reason about what lines will and won't be re-wrapped as a result of
running the formatter.

What is the purpose of these penalties?

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] clang-format: don't align expressions after linebreaks
  2024-10-12  1:49   ` [PATCH v3 3/3] clang-format: don't align expressions after linebreaks Karthik Nayak
@ 2024-10-14 21:23     ` Kyle Lippincott
  2024-10-15 11:17       ` karthik nayak
  0 siblings, 1 reply; 41+ messages in thread
From: Kyle Lippincott @ 2024-10-14 21:23 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, gitster, jltobler, toon, johannes.schindelin

On Fri, Oct 11, 2024 at 6:50 PM Karthik Nayak <karthik.188@gmail.com> wrote:
>
> We enforce alignment of expressions after linebreaks. Which means for
> code such as
>
>     return a || b;
>
> it will expect:
>
>    return a ||
>           b;
>
> we instead want 'b' to be indent with tabs, which is already done by the
> 'ContinuationIndentWidth' variable.

Why do we want `b` to be indented by 8 columns instead of aligned? I
think this is harder to read:

int some_int_variable = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
        bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb;

Of course, this is even better, if it fits in 80 cols:

int some_int_variable =
        aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb;

> So let's explicitly set
> 'AlignOperands' to false.
>
> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
> ---
>  .clang-format | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/.clang-format b/.clang-format
> index 9547fe1b77..b48e7813e4 100644
> --- a/.clang-format
> +++ b/.clang-format
> @@ -42,10 +42,9 @@ AlignConsecutiveMacros: true
>  #   int cccccccc;
>  AlignEscapedNewlines: Left
>
> -# Align operands of binary and ternary expressions
> -# int aaa = bbbbbbbbbbb +
> -#           cccccc;
> -AlignOperands: true
> +# Don't enforce alignment after linebreaks and instead
> +# rely on the ContinuationIndentWidth value.
> +AlignOperands: false
>
>  # Don't align trailing comments
>  # int a; // Comment a
> --
> 2.47.0
>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 2/3] clang-format: align consecutive macro definitions
  2024-10-14 21:12     ` Kyle Lippincott
@ 2024-10-15  7:57       ` karthik nayak
  0 siblings, 0 replies; 41+ messages in thread
From: karthik nayak @ 2024-10-15  7:57 UTC (permalink / raw)
  To: Kyle Lippincott; +Cc: git, gitster, jltobler, toon, johannes.schindelin

[-- Attachment #1: Type: text/plain, Size: 1192 bytes --]

Kyle Lippincott <spectral@google.com> writes:

> On Fri, Oct 11, 2024 at 6:50 PM Karthik Nayak <karthik.188@gmail.com> wrote:
>>
>> We generally align consecutive macro definitions for better readability:
>>
>>   #define OUTPUT_ANNOTATE_COMPAT      (1U<<0)
>>   #define OUTPUT_LONG_OBJECT_NAME     (1U<<1)
>>   #define OUTPUT_RAW_TIMESTAMP        (1U<<2)
>>   #define OUTPUT_PORCELAIN            (1U<<3)
>
> I like this change, thanks. Is there a way of apply clang-format for
> *only* one rule/aspect? i.e. can we apply *only* this, and preserve
> every other line? At first glance, I don't see a way of doing it. If
> there was, I might recommend a whole series just to applying these
> changes, but with how out of compliance much of the codebase is today,
> that's not going to be feasible; we'd need to format it in a way that
> we might not want (the current style), and then fix it, and that seems
> counterproductive.
>

I think we can apply a single rule by specifying the rule over the CLI.

Overall I think its best to take things iteratively. Also because
'clang-format' is a bit rough around the edges and we might discover
some other changes needed.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 3/3] clang-format: don't align expressions after linebreaks
  2024-10-14 21:23     ` Kyle Lippincott
@ 2024-10-15 11:17       ` karthik nayak
  0 siblings, 0 replies; 41+ messages in thread
From: karthik nayak @ 2024-10-15 11:17 UTC (permalink / raw)
  To: Kyle Lippincott; +Cc: git, gitster, jltobler, toon, johannes.schindelin

[-- Attachment #1: Type: text/plain, Size: 1374 bytes --]

Kyle Lippincott <spectral@google.com> writes:

> On Fri, Oct 11, 2024 at 6:50 PM Karthik Nayak <karthik.188@gmail.com> wrote:
>>
>> We enforce alignment of expressions after linebreaks. Which means for
>> code such as
>>
>>     return a || b;
>>
>> it will expect:
>>
>>    return a ||
>>           b;
>>
>> we instead want 'b' to be indent with tabs, which is already done by the
>> 'ContinuationIndentWidth' variable.
>
> Why do we want `b` to be indented by 8 columns instead of aligned? I
> think this is harder to read:
>
> int some_int_variable = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
>         bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb;
>

The reason I added this is because by default editors will follow the
'.editorconfig' and follow the 'tab_width = 8' rule when there is a line
break.

This means more often than not, most patches don't follow this rule. We
don't enforce clang-format at this point so it makes more sense to align
the rule to what everyone is doing. The goal being that once we have a
good set of base rules with less false positives, we can start
enforcing.

> Of course, this is even better, if it fits in 80 cols:
>
> int some_int_variable =
>         aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb;
>

I'm sure we can tweak this with penalties ;) But I'd say this is
something we can tune later.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/3] clang-format: re-adjust line break penalties
  2024-10-14  9:08     ` Toon Claes
  2024-10-14 21:14       ` Taylor Blau
@ 2024-10-15 11:20       ` karthik nayak
  1 sibling, 0 replies; 41+ messages in thread
From: karthik nayak @ 2024-10-15 11:20 UTC (permalink / raw)
  To: Toon Claes; +Cc: git, gitster, jltobler, johannes.schindelin, spectral

[-- Attachment #1: Type: text/plain, Size: 1975 bytes --]

Toon Claes <toon@iotcl.com> writes:

> Karthik Nayak <karthik.188@gmail.com> writes:
>
> [snip]
>
>> This avoids weird formatting like:
>>
>>    static const struct strbuf *
>>           a_really_really_large_function_name(struct strbuf resolved,
>>           const char *path, int flags)
>>
>> or
>>
>>    static const struct strbuf *a_really_really_large_function_name(
>>    	    struct strbuf resolved, const char *path, int flags)
>>
>> to instead have something more readable like:
>>
>>    static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
>>           const char *path, int flags)
>>
>> This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
>> and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
>> few characters above the 80 column limit to make code more readable.
>
> I'm really liking the idea of penalties, but I feel we're relying too
> much on guestimation of these values. What do you think about adding

That is true indeed. There is a bit of guestimation done here, I had to
try various values to find the ones which worked well. I wish there was
a more formal way to do this...

> example files to our codebase? Having concrete examples at hand will
> allow us to tweak the values in the future, while preserving behavior
> for existing cases. Or when we decide to change them, we understand
> what and when.
>
> Now, I'm not sure where to put such files. I think I would suggest
> something like t/style-lint or t/clang-format. Anyway, for our tooling
> it doesn't seem to matter, because both `make style` and
> `ci/run-style-check.sh` pick up all .c and .h files anywhere in the
> source tree. Adding a README to that directory will help people
> understand why the files are there.
>

I'm not too keen on adding examples, for the mere facts that:
1. They will be outdated each time we change rules.
2. The commit message already has the information around each rule.

Karthik

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/3] clang-format: re-adjust line break penalties
  2024-10-14 21:14       ` Taylor Blau
@ 2024-10-15 11:35         ` karthik nayak
  0 siblings, 0 replies; 41+ messages in thread
From: karthik nayak @ 2024-10-15 11:35 UTC (permalink / raw)
  To: Taylor Blau, Toon Claes
  Cc: git, gitster, jltobler, johannes.schindelin, spectral

[-- Attachment #1: Type: text/plain, Size: 1938 bytes --]

Taylor Blau <me@ttaylorr.com> writes:

> On Mon, Oct 14, 2024 at 11:08:29AM +0200, Toon Claes wrote:
>> Karthik Nayak <karthik.188@gmail.com> writes:
>>
>> [snip]
>>
>> > This avoids weird formatting like:
>> >
>> >    static const struct strbuf *
>> >           a_really_really_large_function_name(struct strbuf resolved,
>> >           const char *path, int flags)
>> >
>> > or
>> >
>> >    static const struct strbuf *a_really_really_large_function_name(
>> >    	    struct strbuf resolved, const char *path, int flags)
>> >
>> > to instead have something more readable like:
>> >
>> >    static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
>> >           const char *path, int flags)
>> >
>> > This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
>> > and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
>> > few characters above the 80 column limit to make code more readable.
>>
>> I'm really liking the idea of penalties, but I feel we're relying too
>> much on guestimation of these values. What do you think about adding
>> example files to our codebase? Having concrete examples at hand will
>> allow us to tweak the values in the future, while preserving behavior
>> for existing cases. Or when we decide to change them, we understand
>> what and when.
>
> I am not sure I see it the same way.
>
> I might just be ill-informed or not experienced with these clang-format
> rules, but having these penalties be defined as such makes it difficult
> to reason about what lines will and won't be re-wrapped as a result of
> running the formatter.
>
> What is the purpose of these penalties?
>

We have a column limit set in our clang-format 'ColumnLimit: 80', this
dictates when line breaks should occur in our code.

The penalties are options through which we can influence this decision
[1].

> Thanks,
> Taylor

[1]: https://stackoverflow.com/a/27608250

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/3] clang-format: re-adjust line break penalties
  2024-10-14 20:59     ` Kyle Lippincott
@ 2024-10-15 12:37       ` karthik nayak
  2024-10-16  1:38         ` Kyle Lippincott
  0 siblings, 1 reply; 41+ messages in thread
From: karthik nayak @ 2024-10-15 12:37 UTC (permalink / raw)
  To: Kyle Lippincott; +Cc: git, gitster, jltobler, toon, johannes.schindelin

[-- Attachment #1: Type: text/plain, Size: 8342 bytes --]

Kyle Lippincott <spectral@google.com> writes:

> On Fri, Oct 11, 2024 at 6:50 PM Karthik Nayak <karthik.188@gmail.com> wrote:
>>
>> In 42efde4c29 (clang-format: adjust line break penalties, 2017-09-29) we
>> adjusted the line break penalties to really fine tune what we care about
>> while doing line breaks. Modify some of those to be more inline with
>> what we care about in the Git project now.
>>
>> We need to understand that the values set to penalties in
>> '.clang-format' are relative to each other and do not hold any absolute
>> value. The penalty arguments take an 'Unsigned' value, so we have some
>> liberty over the values we can set.
>>
>> First, in that commit, we decided, that under no circumstances do we
>> want to exceed 80 characters. This seems a bit too strict. We do
>> overshoot this limit from time to time to prioritize readability.
>
> I think that attempting to get the weights right so as to avoid cases
> where there was an intentional affordance for readability is going to
> be essentially impossible. Areas where there's an intentional
> disregard for the clang-format-generated formatting should disable the
> formatter for that line/region, instead of trying to find a way to
> adjust the rules to produce something that's going to end up being
> context dependent.
>

To some extent I agree. But the issue is that clang-format is still not
enforced within the code base. So expecting users to add:
    // clang-format off
will not hold, at least for _now_.

So the next best thing we can do is to get the format rules as close as
we can to the current styling, so the actual errors thrown by the CI job
is something we can look at without containing too many false positives.

> Example: In ref-filter.c, there's 13 lines when initializing the
> `valid_atom` array that are >80 characters, and 20 lines that are >80
> columns (when using 8-space tabs). Line breaking in that block of code
> may be undesirable, so just disable clang-format there. I don't think
> there's a consistent set of penalties you could establish that would
> handle that well without mishandling some other section of code.

While true, we can quantify if it is better or not:

      ❯ ci/run-style-check.sh @~50 | wc -l
      4718 (master)

      ❯ ci/run-style-check.sh @~53 | wc -l
      4475 (with these patches)

And looking through the other changes, those look like violations which
should have been fixed.

> It's also not clear what the reason for the overshoot is in many cases.
> - difference between "80 characters" and "80 columns"?
>     - (1394 >80char lines in *.{h,c}, 4849 >80col lines in the same files)
> - intentional for readability?
> - refactorings pushed originally compliant lines out of compliance?
> - no one caught it and it was just added without any intentional decision?
>

I agree with your inference here, but I'm not sure there is a smooth way
to have this information. Either we go full in and say we enable the
formatting and every patch must conform to it, or we simply keep the
clang-format as a warning system. Currently we do neither. I'd say we
should be in a state to reach the latter and then we can gradually think
of how to move to the former.

>> So
>> let's reduce the value for 'PenaltyExcessCharacter' to 10. This means we
>> that we add a penalty of 10 for each character that exceeds the column
>> limit. By itself this is enough to restrict to column limit. Tuning
>> other penalties in relation to this is what is important.
>>
>> The penalty `PenaltyBreakAssignment` talks about the penalty for
>> breaking an assignment operator on to the next line. In our project, we
>> are okay with this, so giving a value of 5, which is below the value for
>> 'PenaltyExcessCharacter' ensures that in the end, even 1 character over
>> the column limit is not worth keeping an assignment on the same line.
>>
>> Similarly set the penalty for breaking before the first call parameter
>> 'PenaltyBreakBeforeFirstCallParameter' and the penalty for breaking
>> comments 'PenaltyBreakComment' and the penalty for breaking string
>> literals 'PenaltyBreakString' also to 5.
>>
>> Finally, we really care about not breaking the return type into its own
>> line and we really care about not breaking before an open parenthesis.
>> This avoids weird formatting like:
>>
>>    static const struct strbuf *
>>           a_really_really_large_function_name(struct strbuf resolved,
>>           const char *path, int flags)
>
> Is this how it'd be indented without the penalties, or would it do
> this, with the function name indented the same amount as the return
> type (which is, in C, probably going to be the 0th column most times):
>
> static const struct strbuf *
> a_really_really_large_function_name(struct strbuf resolved,
>         const char *path, int flags)
>

It will be indented, so not the 0th column.

>>
>> or
>>
>>    static const struct strbuf *a_really_really_large_function_name(
>>             struct strbuf resolved, const char *path, int flags)
>
> Personal opinion: I prefer this over the version that has a single
> argument on the first line. My preference for reading functions is:
>
> return_type func_name(arg1, arg2,
>                       arg3, arg4,
>                       arg5, arg6, ...);
>
> Or
>
> return_type func_name(
>         arg1, arg2, arg3, arg4,
>         arg5, arg6, ...);
>

I'm mostly basing my changes on the current state of the 'clang-format'
and our code base. I'm happy to change it if everyone agrees on this :)

> or, in some cases, putting every argument on their own line (typically
> when the majority of the arguments are already on their own line, not
> having one "hiding" somewhere is preferable, but at this point if
> that's not what my formatter does, I don't fight it).
>
> For functions that accept an obvious first parameter, such as
> `strbuf_add`, maybe having the first parameter on the first line is
> acceptable/desirable, since it's "obvious" what it is/does. But for
> many functions that's not the case, and needing to read the end of the
> first line, potentially beyond the 80th column, feels weird.
>
>>
>> to instead have something more readable like:
>>
>>    static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
>>           const char *path, int flags)
>>
>> This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
>> and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
>> few characters above the 80 column limit to make code more readable.
>
> A few examples, such as by formatting the code using the current rules
> (since much of the codebase does not currently comply), and then
> changing the penalties and seeing what changes, might be nice?
>

You mean apart from the example above?

>
>>
>> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
>> ---
>>  .clang-format | 13 +++++++------
>>  1 file changed, 7 insertions(+), 6 deletions(-)
>>
>> diff --git a/.clang-format b/.clang-format
>> index 41969eca4b..66a2360ae5 100644
>> --- a/.clang-format
>> +++ b/.clang-format
>> @@ -209,13 +209,14 @@ KeepEmptyLinesAtTheStartOfBlocks: false
>>
>>  # Penalties
>>  # This decides what order things should be done if a line is too long
>> -PenaltyBreakAssignment: 10
>> -PenaltyBreakBeforeFirstCallParameter: 30
>> -PenaltyBreakComment: 10
>> +PenaltyBreakAssignment: 5
>> +PenaltyBreakBeforeFirstCallParameter: 5
>> +PenaltyBreakComment: 5
>>  PenaltyBreakFirstLessLess: 0
>> -PenaltyBreakString: 10
>> -PenaltyExcessCharacter: 100
>> -PenaltyReturnTypeOnItsOwnLine: 60
>> +PenaltyBreakOpenParenthesis: 300
>
> How does this interact with PenaltyBreakBeforeFirstCallParameter? Does
> one override the other?
>

From my understanding PenaltyBreakOpenParenthesis seems to apply more
generally and is the more preferred value.

[1]: https://github.com/llvm/llvm-project/blob/a4367d2d136420f562f64e7731b9393fb609f3fc/clang/lib/Format/TokenAnnotator.cpp#L4322

>> +PenaltyBreakString: 5
>> +PenaltyExcessCharacter: 10
>> +PenaltyReturnTypeOnItsOwnLine: 300
>>
>>  # Don't sort #include's
>>  SortIncludes: false
>> --
>> 2.47.0
>>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/3] clang-format: re-adjust line break penalties
  2024-10-15 12:37       ` karthik nayak
@ 2024-10-16  1:38         ` Kyle Lippincott
  2024-10-16 21:17           ` karthik nayak
  0 siblings, 1 reply; 41+ messages in thread
From: Kyle Lippincott @ 2024-10-16  1:38 UTC (permalink / raw)
  To: karthik nayak; +Cc: git, gitster, jltobler, toon, johannes.schindelin

On Tue, Oct 15, 2024 at 5:37 AM karthik nayak <karthik.188@gmail.com> wrote:
>
> Kyle Lippincott <spectral@google.com> writes:
>
> > On Fri, Oct 11, 2024 at 6:50 PM Karthik Nayak <karthik.188@gmail.com> wrote:
> >>
> >> In 42efde4c29 (clang-format: adjust line break penalties, 2017-09-29) we
> >> adjusted the line break penalties to really fine tune what we care about
> >> while doing line breaks. Modify some of those to be more inline with
> >> what we care about in the Git project now.
> >>
> >> We need to understand that the values set to penalties in
> >> '.clang-format' are relative to each other and do not hold any absolute
> >> value. The penalty arguments take an 'Unsigned' value, so we have some
> >> liberty over the values we can set.
> >>
> >> First, in that commit, we decided, that under no circumstances do we
> >> want to exceed 80 characters. This seems a bit too strict. We do
> >> overshoot this limit from time to time to prioritize readability.
> >
> > I think that attempting to get the weights right so as to avoid cases
> > where there was an intentional affordance for readability is going to
> > be essentially impossible. Areas where there's an intentional
> > disregard for the clang-format-generated formatting should disable the
> > formatter for that line/region, instead of trying to find a way to
> > adjust the rules to produce something that's going to end up being
> > context dependent.
> >
>
> To some extent I agree. But the issue is that clang-format is still not
> enforced within the code base. So expecting users to add:
>     // clang-format off
> will not hold, at least for _now_.
>
> So the next best thing we can do is to get the format rules as close as
> we can to the current styling, so the actual errors thrown by the CI job
> is something we can look at without containing too many false positives.
>
> > Example: In ref-filter.c, there's 13 lines when initializing the
> > `valid_atom` array that are >80 characters, and 20 lines that are >80
> > columns (when using 8-space tabs). Line breaking in that block of code
> > may be undesirable, so just disable clang-format there. I don't think
> > there's a consistent set of penalties you could establish that would
> > handle that well without mishandling some other section of code.
>
> While true, we can quantify if it is better or not:
>
>       ❯ ci/run-style-check.sh @~50 | wc -l
>       4718 (master)
>
>       ❯ ci/run-style-check.sh @~53 | wc -l
>       4475 (with these patches)
>
> And looking through the other changes, those look like violations which
> should have been fixed.
>
> > It's also not clear what the reason for the overshoot is in many cases.
> > - difference between "80 characters" and "80 columns"?
> >     - (1394 >80char lines in *.{h,c}, 4849 >80col lines in the same files)
> > - intentional for readability?
> > - refactorings pushed originally compliant lines out of compliance?
> > - no one caught it and it was just added without any intentional decision?
> >
>
> I agree with your inference here, but I'm not sure there is a smooth way
> to have this information. Either we go full in and say we enable the
> formatting and every patch must conform to it, or we simply keep the
> clang-format as a warning system. Currently we do neither. I'd say we
> should be in a state to reach the latter and then we can gradually think
> of how to move to the former.
>
> >> So
> >> let's reduce the value for 'PenaltyExcessCharacter' to 10. This means we
> >> that we add a penalty of 10 for each character that exceeds the column
> >> limit. By itself this is enough to restrict to column limit. Tuning
> >> other penalties in relation to this is what is important.
> >>
> >> The penalty `PenaltyBreakAssignment` talks about the penalty for
> >> breaking an assignment operator on to the next line. In our project, we
> >> are okay with this, so giving a value of 5, which is below the value for
> >> 'PenaltyExcessCharacter' ensures that in the end, even 1 character over
> >> the column limit is not worth keeping an assignment on the same line.
> >>
> >> Similarly set the penalty for breaking before the first call parameter
> >> 'PenaltyBreakBeforeFirstCallParameter' and the penalty for breaking
> >> comments 'PenaltyBreakComment' and the penalty for breaking string
> >> literals 'PenaltyBreakString' also to 5.
> >>
> >> Finally, we really care about not breaking the return type into its own
> >> line and we really care about not breaking before an open parenthesis.
> >> This avoids weird formatting like:
> >>
> >>    static const struct strbuf *
> >>           a_really_really_large_function_name(struct strbuf resolved,
> >>           const char *path, int flags)
> >
> > Is this how it'd be indented without the penalties, or would it do
> > this, with the function name indented the same amount as the return
> > type (which is, in C, probably going to be the 0th column most times):
> >
> > static const struct strbuf *
> > a_really_really_large_function_name(struct strbuf resolved,
> >         const char *path, int flags)
> >
>
> It will be indented, so not the 0th column.

I'm not getting that behavior when I try it. Is this only indented
with your updated penalties?

Currently on ef8ce8f3d4344fd3af049c17eeba5cd20d98b69f, git status is
clean, added this line (no line breaks, just in case my email client
makes a mess of this) to top of path.h (chosen arbitrarily):

static const struct strbuf *a_really_really_large_function_name(struct
strbuf resolved, const char *path, int flags);

and ran `clang-format path.h | head -n20` and got this (where `int
flags` is indented to align with the opening `(`, but tabs cause
problems yet again):

static const struct strbuf *
a_really_really_large_function_name(struct strbuf resolved, const char *path,
    int flags);

`clang-format --version` shows it's a google-internal build, but it
still respects the .clang-format file, so this shouldn't matter? I'm
assuming it's a relatively recent (within the past 1 month) commit
that it's based off of.

>
> >>
> >> or
> >>
> >>    static const struct strbuf *a_really_really_large_function_name(
> >>             struct strbuf resolved, const char *path, int flags)
> >
> > Personal opinion: I prefer this over the version that has a single
> > argument on the first line. My preference for reading functions is:
> >
> > return_type func_name(arg1, arg2,
> >                       arg3, arg4,
> >                       arg5, arg6, ...);
> >
> > Or
> >
> > return_type func_name(
> >         arg1, arg2, arg3, arg4,
> >         arg5, arg6, ...);
> >
>
> I'm mostly basing my changes on the current state of the 'clang-format'
> and our code base. I'm happy to change it if everyone agrees on this :)

I'm wondering if tabs have caused some confusion here... [thought
continued below]

>
> > or, in some cases, putting every argument on their own line (typically
> > when the majority of the arguments are already on their own line, not
> > having one "hiding" somewhere is preferable, but at this point if
> > that's not what my formatter does, I don't fight it).
> >
> > For functions that accept an obvious first parameter, such as
> > `strbuf_add`, maybe having the first parameter on the first line is
> > acceptable/desirable, since it's "obvious" what it is/does. But for
> > many functions that's not the case, and needing to read the end of the
> > first line, potentially beyond the 80th column, feels weird.
> >
> >>
> >> to instead have something more readable like:
> >>
> >>    static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
> >>           const char *path, int flags)

Does this have `const char *path` aligned with the opening `(`? if so,
then that matches my first example and I'm fine with it. If this is
truly "first argument immediately after (, other arguments indented
one indentation level", then my original comment stands: I don't find
this readable at all, and I don't see evidence of this being an
acceptable format according to our CodingGuidelines document.

I also don't understand how the penalties would produce t


> >>
> >> This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
> >> and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
> >> few characters above the 80 column limit to make code more readable.
> >
> > A few examples, such as by formatting the code using the current rules
> > (since much of the codebase does not currently comply), and then
> > changing the penalties and seeing what changes, might be nice?
> >
>
> You mean apart from the example above?

The example above is hypothetical, but I think I was thinking about
this incorrectly. Our existing codebase isn't formatted with
clang-format, and formatting it first and then adjusting the penalties
doesn't really provide much useful information.

Setting the penalties as you have them in this patch, and running it
on a copy of the line you have there, produces this for me:

static const struct strbuf *
a_really_really_large_function_name(struct strbuf resolved, const char *path,
                                    int flags);

The 80 column limit is still _strictly_ adhered to, it won't even go
over by 1 character:

static const struct strbuf *
a_really_really_large_function_named_Ed(struct strbuf resolved,
                                        const char *path, int flags);

(note: switched tabs to spaces because tabs are difficult to use
during discussions like this)

Specifically:
- 80 column hard limit applied
- return type on its own line
- continuation arguments are aligned on the next line (which is
expected since we have AlignAfterOpenBracket set).

>
> >
> >>
> >> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
> >> ---
> >>  .clang-format | 13 +++++++------
> >>  1 file changed, 7 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/.clang-format b/.clang-format
> >> index 41969eca4b..66a2360ae5 100644
> >> --- a/.clang-format
> >> +++ b/.clang-format
> >> @@ -209,13 +209,14 @@ KeepEmptyLinesAtTheStartOfBlocks: false
> >>
> >>  # Penalties
> >>  # This decides what order things should be done if a line is too long
> >> -PenaltyBreakAssignment: 10
> >> -PenaltyBreakBeforeFirstCallParameter: 30
> >> -PenaltyBreakComment: 10
> >> +PenaltyBreakAssignment: 5
> >> +PenaltyBreakBeforeFirstCallParameter: 5
> >> +PenaltyBreakComment: 5
> >>  PenaltyBreakFirstLessLess: 0
> >> -PenaltyBreakString: 10
> >> -PenaltyExcessCharacter: 100
> >> -PenaltyReturnTypeOnItsOwnLine: 60
> >> +PenaltyBreakOpenParenthesis: 300
> >
> > How does this interact with PenaltyBreakBeforeFirstCallParameter? Does
> > one override the other?
> >
>
> From my understanding PenaltyBreakOpenParenthesis seems to apply more
> generally and is the more preferred value.
>
> [1]: https://github.com/llvm/llvm-project/blob/a4367d2d136420f562f64e7731b9393fb609f3fc/clang/lib/Format/TokenAnnotator.cpp#L4322
>
> >> +PenaltyBreakString: 5
> >> +PenaltyExcessCharacter: 10
> >> +PenaltyReturnTypeOnItsOwnLine: 300
> >>
> >>  # Don't sort #include's
> >>  SortIncludes: false
> >> --
> >> 2.47.0
> >>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v3 1/3] clang-format: re-adjust line break penalties
  2024-10-16  1:38         ` Kyle Lippincott
@ 2024-10-16 21:17           ` karthik nayak
  0 siblings, 0 replies; 41+ messages in thread
From: karthik nayak @ 2024-10-16 21:17 UTC (permalink / raw)
  To: Kyle Lippincott; +Cc: git, gitster, jltobler, toon, johannes.schindelin

[-- Attachment #1: Type: text/plain, Size: 11624 bytes --]

Kyle Lippincott <spectral@google.com> writes:

> On Tue, Oct 15, 2024 at 5:37 AM karthik nayak <karthik.188@gmail.com> wrote:
>>
>> Kyle Lippincott <spectral@google.com> writes:
>>
>> > On Fri, Oct 11, 2024 at 6:50 PM Karthik Nayak <karthik.188@gmail.com> wrote:
>> >>
>> >> In 42efde4c29 (clang-format: adjust line break penalties, 2017-09-29) we
>> >> adjusted the line break penalties to really fine tune what we care about
>> >> while doing line breaks. Modify some of those to be more inline with
>> >> what we care about in the Git project now.
>> >>
>> >> We need to understand that the values set to penalties in
>> >> '.clang-format' are relative to each other and do not hold any absolute
>> >> value. The penalty arguments take an 'Unsigned' value, so we have some
>> >> liberty over the values we can set.
>> >>
>> >> First, in that commit, we decided, that under no circumstances do we
>> >> want to exceed 80 characters. This seems a bit too strict. We do
>> >> overshoot this limit from time to time to prioritize readability.
>> >
>> > I think that attempting to get the weights right so as to avoid cases
>> > where there was an intentional affordance for readability is going to
>> > be essentially impossible. Areas where there's an intentional
>> > disregard for the clang-format-generated formatting should disable the
>> > formatter for that line/region, instead of trying to find a way to
>> > adjust the rules to produce something that's going to end up being
>> > context dependent.
>> >
>>
>> To some extent I agree. But the issue is that clang-format is still not
>> enforced within the code base. So expecting users to add:
>>     // clang-format off
>> will not hold, at least for _now_.
>>
>> So the next best thing we can do is to get the format rules as close as
>> we can to the current styling, so the actual errors thrown by the CI job
>> is something we can look at without containing too many false positives.
>>
>> > Example: In ref-filter.c, there's 13 lines when initializing the
>> > `valid_atom` array that are >80 characters, and 20 lines that are >80
>> > columns (when using 8-space tabs). Line breaking in that block of code
>> > may be undesirable, so just disable clang-format there. I don't think
>> > there's a consistent set of penalties you could establish that would
>> > handle that well without mishandling some other section of code.
>>
>> While true, we can quantify if it is better or not:
>>
>>       ❯ ci/run-style-check.sh @~50 | wc -l
>>       4718 (master)
>>
>>       ❯ ci/run-style-check.sh @~53 | wc -l
>>       4475 (with these patches)
>>
>> And looking through the other changes, those look like violations which
>> should have been fixed.
>>
>> > It's also not clear what the reason for the overshoot is in many cases.
>> > - difference between "80 characters" and "80 columns"?
>> >     - (1394 >80char lines in *.{h,c}, 4849 >80col lines in the same files)
>> > - intentional for readability?
>> > - refactorings pushed originally compliant lines out of compliance?
>> > - no one caught it and it was just added without any intentional decision?
>> >
>>
>> I agree with your inference here, but I'm not sure there is a smooth way
>> to have this information. Either we go full in and say we enable the
>> formatting and every patch must conform to it, or we simply keep the
>> clang-format as a warning system. Currently we do neither. I'd say we
>> should be in a state to reach the latter and then we can gradually think
>> of how to move to the former.
>>
>> >> So
>> >> let's reduce the value for 'PenaltyExcessCharacter' to 10. This means we
>> >> that we add a penalty of 10 for each character that exceeds the column
>> >> limit. By itself this is enough to restrict to column limit. Tuning
>> >> other penalties in relation to this is what is important.
>> >>
>> >> The penalty `PenaltyBreakAssignment` talks about the penalty for
>> >> breaking an assignment operator on to the next line. In our project, we
>> >> are okay with this, so giving a value of 5, which is below the value for
>> >> 'PenaltyExcessCharacter' ensures that in the end, even 1 character over
>> >> the column limit is not worth keeping an assignment on the same line.
>> >>
>> >> Similarly set the penalty for breaking before the first call parameter
>> >> 'PenaltyBreakBeforeFirstCallParameter' and the penalty for breaking
>> >> comments 'PenaltyBreakComment' and the penalty for breaking string
>> >> literals 'PenaltyBreakString' also to 5.
>> >>
>> >> Finally, we really care about not breaking the return type into its own
>> >> line and we really care about not breaking before an open parenthesis.
>> >> This avoids weird formatting like:
>> >>
>> >>    static const struct strbuf *
>> >>           a_really_really_large_function_name(struct strbuf resolved,
>> >>           const char *path, int flags)
>> >
>> > Is this how it'd be indented without the penalties, or would it do
>> > this, with the function name indented the same amount as the return
>> > type (which is, in C, probably going to be the 0th column most times):
>> >
>> > static const struct strbuf *
>> > a_really_really_large_function_name(struct strbuf resolved,
>> >         const char *path, int flags)
>> >
>>
>> It will be indented, so not the 0th column.
>
> I'm not getting that behavior when I try it. Is this only indented
> with your updated penalties?
>
> Currently on ef8ce8f3d4344fd3af049c17eeba5cd20d98b69f, git status is
> clean, added this line (no line breaks, just in case my email client
> makes a mess of this) to top of path.h (chosen arbitrarily):
>
> static const struct strbuf *a_really_really_large_function_name(struct
> strbuf resolved, const char *path, int flags);
>
> and ran `clang-format path.h | head -n20` and got this (where `int
> flags` is indented to align with the opening `(`, but tabs cause
> problems yet again):
>
> static const struct strbuf *
> a_really_really_large_function_name(struct strbuf resolved, const char *path,
>     int flags);
>
> `clang-format --version` shows it's a google-internal build, but it
> still respects the .clang-format file, so this shouldn't matter? I'm
> assuming it's a relatively recent (within the past 1 month) commit
> that it's based off of.
>

You're absolutely right, this is also what I get. Sorry for the
confusion, but I assumed you were talking about the third line, i.e.
`const char *path, int flags)`.

I also ran it on the CI to make it easier on the eyes (specifically with
the tabs): [1]

>>
>> >>
>> >> or
>> >>
>> >>    static const struct strbuf *a_really_really_large_function_name(
>> >>             struct strbuf resolved, const char *path, int flags)
>> >
>> > Personal opinion: I prefer this over the version that has a single
>> > argument on the first line. My preference for reading functions is:
>> >
>> > return_type func_name(arg1, arg2,
>> >                       arg3, arg4,
>> >                       arg5, arg6, ...);
>> >
>> > Or
>> >
>> > return_type func_name(
>> >         arg1, arg2, arg3, arg4,
>> >         arg5, arg6, ...);
>> >
>>
>> I'm mostly basing my changes on the current state of the 'clang-format'
>> and our code base. I'm happy to change it if everyone agrees on this :)
>
> I'm wondering if tabs have caused some confusion here... [thought
> continued below]
>
>>
>> > or, in some cases, putting every argument on their own line (typically
>> > when the majority of the arguments are already on their own line, not
>> > having one "hiding" somewhere is preferable, but at this point if
>> > that's not what my formatter does, I don't fight it).
>> >
>> > For functions that accept an obvious first parameter, such as
>> > `strbuf_add`, maybe having the first parameter on the first line is
>> > acceptable/desirable, since it's "obvious" what it is/does. But for
>> > many functions that's not the case, and needing to read the end of the
>> > first line, potentially beyond the 80th column, feels weird.
>> >
>> >>
>> >> to instead have something more readable like:
>> >>
>> >>    static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
>> >>           const char *path, int flags)
>
> Does this have `const char *path` aligned with the opening `(`? if so,
> then that matches my first example and I'm fine with it.

Yes, it is. Here is a CI job to demonstrate. I hope this makes it
clearer: [2]


> If this is
> truly "first argument immediately after (, other arguments indented
> one indentation level", then my original comment stands: I don't find
> this readable at all, and I don't see evidence of this being an
> acceptable format according to our CodingGuidelines document.
>
> I also don't understand how the penalties would produce t
>

The penalties define when the linebreak should happen. The alignment is
handled by the `AlignAfterOpenBracket: Align` rule we have in
'.clang-format'.

>> >>
>> >> This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
>> >> and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
>> >> few characters above the 80 column limit to make code more readable.
>> >
>> > A few examples, such as by formatting the code using the current rules
>> > (since much of the codebase does not currently comply), and then
>> > changing the penalties and seeing what changes, might be nice?
>> >
>>
>> You mean apart from the example above?
>
> The example above is hypothetical, but I think I was thinking about
> this incorrectly. Our existing codebase isn't formatted with
> clang-format, and formatting it first and then adjusting the penalties
> doesn't really provide much useful information.
>
> Setting the penalties as you have them in this patch, and running it
> on a copy of the line you have there, produces this for me:
>
> static const struct strbuf *
> a_really_really_large_function_name(struct strbuf resolved, const char *path,
>                                     int flags);
>
> The 80 column limit is still _strictly_ adhered to, it won't even go
> over by 1 character:
>
> static const struct strbuf *
> a_really_really_large_function_named_Ed(struct strbuf resolved,
>                                         const char *path, int flags);
>
> (note: switched tabs to spaces because tabs are difficult to use
> during discussions like this)
>
> Specifically:
> - 80 column hard limit applied
> - return type on its own line
> - continuation arguments are aligned on the next line (which is
> expected since we have AlignAfterOpenBracket set).
>

But this is not what I'm seeing though, even the CI [2] confirms that.
I'm seeing

static const struct strbuf *a_really_really_large_function_name(struct
strbuf resolved,
                                                                const
char *path,
                                                                int flags);

where the consecutive lines are aligned on '('.

(note: switched tabs to spaces too)

Thanks for the back and forth though, I guess I have some feedback for
the next version:
- Add some examples in a file like Toon suggested, showing how the
clang-format would work.
- Clarify the commit message to make it clearer about how the penalties
work with other rules.

[snip]

[1]: https://gitlab.com/gitlab-org/git/-/jobs/8105793089
[2]: https://gitlab.com/gitlab-org/git/-/jobs/8105737945

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner
  2024-10-09 12:51 [PATCH 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
                   ` (4 preceding siblings ...)
  2024-10-12  1:49 ` [PATCH v3 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
@ 2024-10-18  8:46 ` Karthik Nayak
  2024-10-18  8:46   ` [PATCH v4 1/2] clang-format: re-adjust line break penalties Karthik Nayak
                     ` (2 more replies)
  5 siblings, 3 replies; 41+ messages in thread
From: Karthik Nayak @ 2024-10-18  8:46 UTC (permalink / raw)
  To: karthik.188; +Cc: git, jltobler, toon, spectral

The clang-format CI job is currently cluttered due to too many errors being
reported. See some of the examples here:

* https://gitlab.com/gitlab-org/git/-/jobs/7854601948
* https://gitlab.com/gitlab-org/git/-/jobs/7843131109

So modify the clang-format with the following changes:
1. Modify the penalties for linebreaks to be more considerate towards
readability. The commit goes into detail explaining how/why.  
2. Align the macro definitions. This is something we follow to keep the macros
readable. 

I will still keep monitoring the jobs from time to time to ensure we can fine
tune more as needed, if someone see's something odd, do keep me in the loop.

Thanks

Changes over the previous version:
1. I made the example in the first commit message a bit clearer so it is easier
to understand.
2. Removed the third commit, since I was convinced that it is good as-is for now. 

Karthik Nayak (2):
  clang-format: re-adjust line break penalties
  clang-format: align consecutive macro definitions

 .clang-format | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

Range-diff against v3:
1:  74bbd2f9db ! 1:  a8c8df5d95 clang-format: re-adjust line break penalties
    @@ Commit message
         to instead have something more readable like:
     
            static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
    -              const char *path, int flags)
    +                                                                       const char *path,
    +                                                                       int flags)
    +
    +    (note: the tabs here have been replaced by spaces for easier reading)
     
         This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
         and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
2:  1586d53769 = 2:  fcf965ac74 clang-format: align consecutive macro definitions
3:  36a53299c1 < -:  ---------- clang-format: don't align expressions after linebreaks
-- 
2.47.0


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v4 1/2] clang-format: re-adjust line break penalties
  2024-10-18  8:46 ` [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner Karthik Nayak
@ 2024-10-18  8:46   ` Karthik Nayak
  2024-10-25  2:37     ` Justin Tobler
  2024-10-18  8:46   ` [PATCH v4 2/2] clang-format: align consecutive macro definitions Karthik Nayak
  2024-10-18 21:37   ` [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner Taylor Blau
  2 siblings, 1 reply; 41+ messages in thread
From: Karthik Nayak @ 2024-10-18  8:46 UTC (permalink / raw)
  To: karthik.188; +Cc: git, jltobler, toon, spectral

In 42efde4c29 (clang-format: adjust line break penalties, 2017-09-29) we
adjusted the line break penalties to really fine tune what we care about
while doing line breaks. Modify some of those to be more inline with
what we care about in the Git project now.

We need to understand that the values set to penalties in
'.clang-format' are relative to each other and do not hold any absolute
value. The penalty arguments take an 'Unsigned' value, so we have some
liberty over the values we can set.

First, in that commit, we decided, that under no circumstances do we
want to exceed 80 characters. This seems a bit too strict. We do
overshoot this limit from time to time to prioritize readability. So
let's reduce the value for 'PenaltyExcessCharacter' to 10. This means we
that we add a penalty of 10 for each character that exceeds the column
limit. By itself this is enough to restrict to column limit. Tuning
other penalties in relation to this is what is important.

The penalty `PenaltyBreakAssignment` talks about the penalty for
breaking an assignment operator on to the next line. In our project, we
are okay with this, so giving a value of 5, which is below the value for
'PenaltyExcessCharacter' ensures that in the end, even 1 character over
the column limit is not worth keeping an assignment on the same line.

Similarly set the penalty for breaking before the first call parameter
'PenaltyBreakBeforeFirstCallParameter' and the penalty for breaking
comments 'PenaltyBreakComment' and the penalty for breaking string
literals 'PenaltyBreakString' also to 5.

Finally, we really care about not breaking the return type into its own
line and we really care about not breaking before an open parenthesis.
This avoids weird formatting like:

   static const struct strbuf *
          a_really_really_large_function_name(struct strbuf resolved,
          const char *path, int flags)

or

   static const struct strbuf *a_really_really_large_function_name(
   	    struct strbuf resolved, const char *path, int flags)

to instead have something more readable like:

   static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
                                                                   const char *path,
                                                                   int flags)

(note: the tabs here have been replaced by spaces for easier reading)

This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
few characters above the 80 column limit to make code more readable.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 .clang-format | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/.clang-format b/.clang-format
index 41969eca4b..66a2360ae5 100644
--- a/.clang-format
+++ b/.clang-format
@@ -209,13 +209,14 @@ KeepEmptyLinesAtTheStartOfBlocks: false

 # Penalties
 # This decides what order things should be done if a line is too long
-PenaltyBreakAssignment: 10
-PenaltyBreakBeforeFirstCallParameter: 30
-PenaltyBreakComment: 10
+PenaltyBreakAssignment: 5
+PenaltyBreakBeforeFirstCallParameter: 5
+PenaltyBreakComment: 5
 PenaltyBreakFirstLessLess: 0
-PenaltyBreakString: 10
-PenaltyExcessCharacter: 100
-PenaltyReturnTypeOnItsOwnLine: 60
+PenaltyBreakOpenParenthesis: 300
+PenaltyBreakString: 5
+PenaltyExcessCharacter: 10
+PenaltyReturnTypeOnItsOwnLine: 300

 # Don't sort #include's
 SortIncludes: false
-- 
2.47.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v4 2/2] clang-format: align consecutive macro definitions
  2024-10-18  8:46 ` [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner Karthik Nayak
  2024-10-18  8:46   ` [PATCH v4 1/2] clang-format: re-adjust line break penalties Karthik Nayak
@ 2024-10-18  8:46   ` Karthik Nayak
  2024-10-18 21:37   ` [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner Taylor Blau
  2 siblings, 0 replies; 41+ messages in thread
From: Karthik Nayak @ 2024-10-18  8:46 UTC (permalink / raw)
  To: karthik.188; +Cc: git, jltobler, toon, spectral

We generally align consecutive macro definitions for better readability:

  #define OUTPUT_ANNOTATE_COMPAT      (1U<<0)
  #define OUTPUT_LONG_OBJECT_NAME     (1U<<1)
  #define OUTPUT_RAW_TIMESTAMP        (1U<<2)
  #define OUTPUT_PORCELAIN            (1U<<3)

over

  #define OUTPUT_ANNOTATE_COMPAT (1U<<0)
  #define OUTPUT_LONG_OBJECT_NAME (1U<<1)
  #define OUTPUT_RAW_TIMESTAMP (1U<<2)
  #define OUTPUT_PORCELAIN (1U<<3)

So let's add the rule in clang-format to follow this.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 .clang-format | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/.clang-format b/.clang-format
index 66a2360ae5..9547fe1b77 100644
--- a/.clang-format
+++ b/.clang-format
@@ -32,6 +32,9 @@ AlignConsecutiveAssignments: false
 # double b = 3.14;
 AlignConsecutiveDeclarations: false
 
+# Align consecutive macro definitions.
+AlignConsecutiveMacros: true
+
 # Align escaped newlines as far left as possible
 # #define A   \
 #   int aaaa; \
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner
  2024-10-18  8:46 ` [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner Karthik Nayak
  2024-10-18  8:46   ` [PATCH v4 1/2] clang-format: re-adjust line break penalties Karthik Nayak
  2024-10-18  8:46   ` [PATCH v4 2/2] clang-format: align consecutive macro definitions Karthik Nayak
@ 2024-10-18 21:37   ` Taylor Blau
  2024-10-20 11:17     ` karthik nayak
  2 siblings, 1 reply; 41+ messages in thread
From: Taylor Blau @ 2024-10-18 21:37 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, jltobler, toon, spectral

On Fri, Oct 18, 2024 at 10:46:44AM +0200, Karthik Nayak wrote:
> Karthik Nayak (2):
>   clang-format: re-adjust line break penalties
>   clang-format: align consecutive macro definitions
>
>  .clang-format | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)

Thanks, will queue. Are we ready to start merging this one down?

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner
  2024-10-18 21:37   ` [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner Taylor Blau
@ 2024-10-20 11:17     ` karthik nayak
  2024-10-21 21:48       ` Taylor Blau
  0 siblings, 1 reply; 41+ messages in thread
From: karthik nayak @ 2024-10-20 11:17 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, jltobler, toon, spectral

[-- Attachment #1: Type: text/plain, Size: 448 bytes --]

Taylor Blau <me@ttaylorr.com> writes:

> On Fri, Oct 18, 2024 at 10:46:44AM +0200, Karthik Nayak wrote:
>> Karthik Nayak (2):
>>   clang-format: re-adjust line break penalties
>>   clang-format: align consecutive macro definitions
>>
>>  .clang-format | 16 ++++++++++------
>>  1 file changed, 10 insertions(+), 6 deletions(-)
>
> Thanks, will queue. Are we ready to start merging this one down?
>

I'd wait for some reviews :)

> Thanks,
> Taylor

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner
  2024-10-20 11:17     ` karthik nayak
@ 2024-10-21 21:48       ` Taylor Blau
  2024-10-22  8:31         ` karthik nayak
  0 siblings, 1 reply; 41+ messages in thread
From: Taylor Blau @ 2024-10-21 21:48 UTC (permalink / raw)
  To: karthik nayak; +Cc: git, jltobler, toon, spectral

On Sun, Oct 20, 2024 at 06:17:58AM -0500, karthik nayak wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > On Fri, Oct 18, 2024 at 10:46:44AM +0200, Karthik Nayak wrote:
> >> Karthik Nayak (2):
> >>   clang-format: re-adjust line break penalties
> >>   clang-format: align consecutive macro definitions
> >>
> >>  .clang-format | 16 ++++++++++------
> >>  1 file changed, 10 insertions(+), 6 deletions(-)
> >
> > Thanks, will queue. Are we ready to start merging this one down?
>
> I'd wait for some reviews :)

OK. My impression was that the dust had more or less settled from
earlier rounds. But let's wait.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner
  2024-10-21 21:48       ` Taylor Blau
@ 2024-10-22  8:31         ` karthik nayak
  2024-10-22 16:42           ` Taylor Blau
  0 siblings, 1 reply; 41+ messages in thread
From: karthik nayak @ 2024-10-22  8:31 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, jltobler, toon, spectral

[-- Attachment #1: Type: text/plain, Size: 798 bytes --]

Taylor Blau <me@ttaylorr.com> writes:

> On Sun, Oct 20, 2024 at 06:17:58AM -0500, karthik nayak wrote:
>> Taylor Blau <me@ttaylorr.com> writes:
>>
>> > On Fri, Oct 18, 2024 at 10:46:44AM +0200, Karthik Nayak wrote:
>> >> Karthik Nayak (2):
>> >>   clang-format: re-adjust line break penalties
>> >>   clang-format: align consecutive macro definitions
>> >>
>> >>  .clang-format | 16 ++++++++++------
>> >>  1 file changed, 10 insertions(+), 6 deletions(-)
>> >
>> > Thanks, will queue. Are we ready to start merging this one down?
>>
>> I'd wait for some reviews :)
>
> OK. My impression was that the dust had more or less settled from
> earlier rounds. But let's wait.
>
> Thanks,
> Taylor

I'd be happy if it merged down, I'll see if someone from GitLab can help
with a review.

Thanks,
Karthik

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner
  2024-10-22  8:31         ` karthik nayak
@ 2024-10-22 16:42           ` Taylor Blau
  0 siblings, 0 replies; 41+ messages in thread
From: Taylor Blau @ 2024-10-22 16:42 UTC (permalink / raw)
  To: karthik nayak; +Cc: git, jltobler, toon, spectral

On Tue, Oct 22, 2024 at 04:31:37AM -0400, karthik nayak wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > On Sun, Oct 20, 2024 at 06:17:58AM -0500, karthik nayak wrote:
> >> Taylor Blau <me@ttaylorr.com> writes:
> >>
> >> > On Fri, Oct 18, 2024 at 10:46:44AM +0200, Karthik Nayak wrote:
> >> >> Karthik Nayak (2):
> >> >>   clang-format: re-adjust line break penalties
> >> >>   clang-format: align consecutive macro definitions
> >> >>
> >> >>  .clang-format | 16 ++++++++++------
> >> >>  1 file changed, 10 insertions(+), 6 deletions(-)
> >> >
> >> > Thanks, will queue. Are we ready to start merging this one down?
> >>
> >> I'd wait for some reviews :)
> >
> > OK. My impression was that the dust had more or less settled from
> > earlier rounds. But let's wait.
> >
> > Thanks,
> > Taylor
>
> I'd be happy if it merged down, I'll see if someone from GitLab can help
> with a review.

Having additional reviewer eyes is much appreciated. Let's err on the
side of that rather than rushing a topic if you don't feel that there is
consensus yet.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v4 1/2] clang-format: re-adjust line break penalties
  2024-10-18  8:46   ` [PATCH v4 1/2] clang-format: re-adjust line break penalties Karthik Nayak
@ 2024-10-25  2:37     ` Justin Tobler
  2024-10-25  9:48       ` karthik nayak
  0 siblings, 1 reply; 41+ messages in thread
From: Justin Tobler @ 2024-10-25  2:37 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git, toon, spectral

On 24/10/18 10:46AM, Karthik Nayak wrote:
> In 42efde4c29 (clang-format: adjust line break penalties, 2017-09-29) we
> adjusted the line break penalties to really fine tune what we care about
> while doing line breaks. Modify some of those to be more inline with
> what we care about in the Git project now.

From my understanding, the original motivation for these changes was to
cut down on the noise from the clang-format CI job. These changes seem
reasonable for that purpose, but affect the also formatter in general.

Out of curiousity, would it be possible to just configured clang-format
for the CI job to behave in this manner? Ultimately, I'm not sure that
would be good idea though because having a diverged set of rules may
just cause more noise.

-Justin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v4 1/2] clang-format: re-adjust line break penalties
  2024-10-25  2:37     ` Justin Tobler
@ 2024-10-25  9:48       ` karthik nayak
  0 siblings, 0 replies; 41+ messages in thread
From: karthik nayak @ 2024-10-25  9:48 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git, toon, spectral

[-- Attachment #1: Type: text/plain, Size: 1077 bytes --]

Justin Tobler <jltobler@gmail.com> writes:

> On 24/10/18 10:46AM, Karthik Nayak wrote:
>> In 42efde4c29 (clang-format: adjust line break penalties, 2017-09-29) we
>> adjusted the line break penalties to really fine tune what we care about
>> while doing line breaks. Modify some of those to be more inline with
>> what we care about in the Git project now.
>
> From my understanding, the original motivation for these changes was to
> cut down on the noise from the clang-format CI job. These changes seem
> reasonable for that purpose, but affect the also formatter in general.
>

Yes, you're right. Which is the intended affect.

> Out of curiousity, would it be possible to just configured clang-format
> for the CI job to behave in this manner? Ultimately, I'm not sure that
> would be good idea though because having a diverged set of rules may
> just cause more noise.
>

We do that in 'ci/run-style-check.sh' already, but here I'd say there is
no need to diverge. We do want users to apply clang-format to _their_
changes, which should be similar to what the CI barfs.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2024-10-25  9:48 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-09 12:51 [PATCH 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
2024-10-09 12:55 ` [PATCH 1/3] clang-format: don't enforce the column limit Karthik Nayak
2024-10-09 15:45   ` Justin Tobler
2024-10-09 22:32     ` Junio C Hamano
2024-10-10 16:48       ` karthik nayak
2024-10-09 12:56 ` [PATCH 2/3] clang-format: don't align expressions after linebreaks Karthik Nayak
2024-10-09 12:56 ` [PATCH 3/3] clang-format: align consecutive macro definitions Karthik Nayak
2024-10-10  8:27   ` Toon Claes
2024-10-10 17:59 ` [PATCH v2 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
2024-10-10 17:59   ` [PATCH v2 1/3] clang-format: change column limit to 96 characters Karthik Nayak
2024-10-10 18:11     ` Kyle Lippincott
2024-10-10 19:49       ` karthik nayak
2024-10-10 20:09         ` Kyle Lippincott
2024-10-10 17:59   ` [PATCH v2 2/3] clang-format: don't align expressions after linebreaks Karthik Nayak
2024-10-10 17:59   ` [PATCH v2 3/3] clang-format: align consecutive macro definitions Karthik Nayak
2024-10-12  1:49 ` [PATCH v3 0/3] clang-format: fix rules to make the CI job cleaner Karthik Nayak
2024-10-12  1:49   ` [PATCH v3 1/3] clang-format: re-adjust line break penalties Karthik Nayak
2024-10-14  9:08     ` Toon Claes
2024-10-14 21:14       ` Taylor Blau
2024-10-15 11:35         ` karthik nayak
2024-10-15 11:20       ` karthik nayak
2024-10-14 20:59     ` Kyle Lippincott
2024-10-15 12:37       ` karthik nayak
2024-10-16  1:38         ` Kyle Lippincott
2024-10-16 21:17           ` karthik nayak
2024-10-12  1:49   ` [PATCH v3 2/3] clang-format: align consecutive macro definitions Karthik Nayak
2024-10-14 21:12     ` Kyle Lippincott
2024-10-15  7:57       ` karthik nayak
2024-10-12  1:49   ` [PATCH v3 3/3] clang-format: don't align expressions after linebreaks Karthik Nayak
2024-10-14 21:23     ` Kyle Lippincott
2024-10-15 11:17       ` karthik nayak
2024-10-18  8:46 ` [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner Karthik Nayak
2024-10-18  8:46   ` [PATCH v4 1/2] clang-format: re-adjust line break penalties Karthik Nayak
2024-10-25  2:37     ` Justin Tobler
2024-10-25  9:48       ` karthik nayak
2024-10-18  8:46   ` [PATCH v4 2/2] clang-format: align consecutive macro definitions Karthik Nayak
2024-10-18 21:37   ` [PATCH v4 0/2] Subject: clang-format: fix rules to make the CI job cleaner Taylor Blau
2024-10-20 11:17     ` karthik nayak
2024-10-21 21:48       ` Taylor Blau
2024-10-22  8:31         ` karthik nayak
2024-10-22 16:42           ` Taylor Blau

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).