git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/10] A couple of CI improvements
@ 2025-01-03 14:46 Patrick Steinhardt
  2025-01-03 14:46 ` [PATCH 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
                   ` (13 more replies)
  0 siblings, 14 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 14:46 UTC (permalink / raw)
  To: git

Hi,

this patch series addresses a couple of issues I've found while
investigating flaky CI jobs. Besides two more fixes for flaky jobs it
also removes some stale code and simplifies the setup on GitHub Actions
to always use containerized jobs on Linux.

Test runs can be found for GitLab [1] and GitHub [2].

Thanks!

Patrick

[1]: https://gitlab.com/gitlab-org/git/-/merge_requests/277
[2]: https://github.com/git/git/pull/1865

---
Patrick Steinhardt (10):
      t0060: fix EBUSY in MinGW when setting up runtime prefix
      t7422: fix flaky test caused by buffered stdout
      github: adapt containerized jobs to be rootless
      github: convert all Linux jobs to be containerized
      github: simplify computation of the job's distro
      gitlab-ci: remove the "linux-old" job
      gitlab-ci: add linux32 job testing against i386
      ci: stop special-casing for Ubuntu 16.04
      ci: use latest Ubuntu release
      ci: remove stale code for Azure Pipelines

 .github/workflows/main.yml  | 78 ++++++++++++++++++++++-----------------------
 .gitlab-ci.yml              | 19 ++++++-----
 ci/install-dependencies.sh  |  6 ++--
 ci/lib.sh                   | 34 +++-----------------
 ci/print-test-failures.sh   |  5 ---
 t/t0060-path-utils.sh       | 10 +++---
 t/t7422-submodule-output.sh | 10 ++++--
 7 files changed, 68 insertions(+), 94 deletions(-)


---
base-commit: 1b4e9a5f8b5f048972c21fe8acafe0404096f694
change-id: 20250103-b4-pks-ci-fixes-2d0a23fb5c78


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
@ 2025-01-03 14:46 ` Patrick Steinhardt
  2025-01-03 14:46 ` [PATCH 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 14:46 UTC (permalink / raw)
  To: git

Two of our tests in t0060 verify that the runtime prefix functionality
works as expected by creating a separate directory hierarchy, copying
the Git executable in there and then creating scripts relative to that
executable.

These tests fail quite regularly in GitLab CI with the following error:

    expecting success of 0060.218 '%(prefix)/ works':
            mkdir -p pretend/bin &&
            cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
            git config yes.path "%(prefix)/yes" &&
            GIT_EXEC_PATH= ./pretend/bin/git config --path yes.path >actual &&
            echo "$(pwd)/pretend/yes" >expect &&
            test_cmp expect actual
    ++ mkdir -p pretend/bin
    ++ cp /c/GitLab-Runner/builds/gitlab-org/git/git.exe pretend/bin/
    cp: cannot create regular file 'pretend/bin/git.exe': Device or resource busy
    error: last command exited with $?=1
    not ok 218 - %(prefix)/ works

Seemingly, the "git.exe" binary we are trying to overwrite is still
being held open. It is somewhat puzzling why exactly that is: while the
preceding test _does_ write to and execute the same path, it should have
exited and shouldn't keep any backgrounded processes around. So it must
be held open by something else, either in MinGW or in Windows itself.

While the root cause is puzzling, the workaround is trivial enough:
instead of writing the file twice we simply pull the common setup into a
separate test case so that we won't observe EBUSY in the first place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t0060-path-utils.sh | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index dbb2e73bcd912ae6a804603ff54e4c609966fa5d..8545cdfab559b4e247cb2699965e637529fd930a 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -592,17 +592,19 @@ test_lazy_prereq CAN_EXEC_IN_PWD '
 	./git rev-parse
 '
 
+test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'setup runtime prefix' '
+	mkdir -p pretend/bin &&
+	cp "$GIT_EXEC_PATH"/git$X pretend/bin/
+'
+
 test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX works' '
-	mkdir -p pretend/bin pretend/libexec/git-core &&
+	mkdir -p pretend/libexec/git-core &&
 	echo "echo HERE" | write_script pretend/libexec/git-core/git-here &&
-	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
 	GIT_EXEC_PATH= ./pretend/bin/git here >actual &&
 	echo HERE >expect &&
 	test_cmp expect actual'
 
 test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' '
-	mkdir -p pretend/bin &&
-	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
 	git config yes.path "%(prefix)/yes" &&
 	GIT_EXEC_PATH= ./pretend/bin/git config --path yes.path >actual &&
 	echo "$(pwd)/pretend/yes" >expect &&

-- 
2.48.0.rc1.241.g6c04ab211c.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
  2025-01-03 14:46 ` [PATCH 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
@ 2025-01-03 14:46 ` Patrick Steinhardt
  2025-01-03 18:17   ` Jeff King
  2025-01-03 14:46 ` [PATCH 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 14:46 UTC (permalink / raw)
  To: git

One test in t7422 asserts that `git submodule status --recursive`
properly handles SIGPIPE. This test is flaky though and may sometimes
not see a SIGPIPE at all:

    expecting success of 7422.18 'git submodule status --recursive propagates SIGPIPE':
            { git submodule status --recursive 2>err; echo $?>status; } |
                    grep -q X/S &&
            test_must_be_empty err &&
            test_match_signal 13 "$(cat status)"
    ++ git submodule status --recursive
    ++ grep -q X/S
    ++ echo 0
    ++ test_must_be_empty err
    ++ test 1 -ne 1
    ++ test_path_is_file err
    ++ test 1 -ne 1
    ++ test -f err
    ++ test -s err
    +++ cat status
    ++ test_match_signal 13 0
    ++ test 0 = 141
    ++ test 0 = 269
    ++ return 1
    error: last command exited with $?=1
    not ok 18 - git submodule status --recursive propagates SIGPIPE

The issue is caused by us using grep(1) to terminate the pipe on the
first matching line in the recursing git-submodule(1) process. Standard
streams are typically buffered though, so this condition is racy and may
cause us to terminate the pipe after git-submodule(1) has already
exited, and in that case we wouldn't see the expected signal.

Fix the issue by converting standard streams to be unbuffered. I have
only been able to reproduce this issue a single time after running t7422
with `--stress` after an extended amount of time, so I cannot claim to
be fully certain that this fix is sufficient.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t7422-submodule-output.sh | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index f21e9203678b94701281d5339ae8bfe53d5de0ed..ba843c02c9c2da198578aec5716813de32960b86 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -166,9 +166,13 @@ do
 	'
 done
 
-test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
-	{ git submodule status --recursive 2>err; echo $?>status; } |
-		grep -q X/S &&
+test_lazy_prereq STDBUF '
+	stdbuf --version
+'
+
+test_expect_success !MINGW,STDBUF 'git submodule status --recursive propagates SIGPIPE' '
+	{ stdbuf -oL git submodule status --recursive 2>err; echo $?>status; } |
+		stdbuf -i0 grep -q X/S &&
 	test_must_be_empty err &&
 	test_match_signal 13 "$(cat status)"
 '

-- 
2.48.0.rc1.241.g6c04ab211c.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 03/10] github: adapt containerized jobs to be rootless
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
  2025-01-03 14:46 ` [PATCH 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
  2025-01-03 14:46 ` [PATCH 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
@ 2025-01-03 14:46 ` Patrick Steinhardt
  2025-01-03 14:46 ` [PATCH 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 14:46 UTC (permalink / raw)
  To: git

The containerized jobs in GitHub Actions run as root, giving them
special permissions to for example delete files even when the user
shouldn't be able to due to file permissions. This limitation keeps us
from using containerized jobs for most of our Ubuntu-based jobs as it
causes a number of tests to fail.

Adapt the jobs to create a separate user that executes the test suite.
This follows similar infrastructure that we already have in GitLab CI.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 6 ++++--
 ci/install-dependencies.sh | 2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 900be9957a23fcaa64e1aefd0c8638c5f84b7997..b02f5873a540b458d38e7951b4ee3d5ca598ae23 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -371,10 +371,12 @@ jobs:
       run: apt -q update && apt -q -y install libc6-amd64 lib64stdc++6
     - uses: actions/checkout@v4
     - run: ci/install-dependencies.sh
-    - run: ci/run-build-and-tests.sh
+    - run: useradd builder --create-home
+    - run: chown -R builder .
+    - run: sudo --preserve-env --set-home --user=builder ci/run-build-and-tests.sh
     - name: print test failures
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
-      run: ci/print-test-failures.sh
+      run: sudo --preserve-env --set-home --user=builder ci/print-test-failures.sh
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v4
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index d1cb9fa8785388b3674fcea4dd682abc0725c968..ecb5b9d36c20d3e7e96148ac628a96c62642c308 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -31,7 +31,7 @@ alpine-*)
 	;;
 fedora-*|almalinux-*)
 	dnf -yq update >/dev/null &&
-	dnf -yq install make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
+	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
 	;;
 ubuntu-*|ubuntu32-*|debian-*)
 	# Required so that apt doesn't wait for user input on certain packages.

-- 
2.48.0.rc1.241.g6c04ab211c.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 04/10] github: convert all Linux jobs to be containerized
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
                   ` (2 preceding siblings ...)
  2025-01-03 14:46 ` [PATCH 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
@ 2025-01-03 14:46 ` Patrick Steinhardt
  2025-01-03 18:56   ` Jeff King
  2025-01-03 19:16   ` Junio C Hamano
  2025-01-03 14:46 ` [PATCH 05/10] github: simplify computation of the job's distro Patrick Steinhardt
                   ` (9 subsequent siblings)
  13 siblings, 2 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 14:46 UTC (permalink / raw)
  To: git

We have split the CI jobs in GitHub Workflows into two categories:

  - Those running on a machine pool directly.

  - Those running in a container on the machine pool.

The latter is more flexible because it allows us to freely pick whatever
container image we want to use for a specific job, while the former only
allows us to pick from a handful of different distros. The containerized
jobs shouldn't cause a significant slowdown, either, so they do not have
any significant upside to the best of my knowlegde. The only upside that
they did have before the preceding commit is that they run as a non-root
user, but that has been addressed now.

Convert all Linux jobs to be containerized for additional flexibility.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 68 ++++++++++++++++++++++++++--------------------
 1 file changed, 39 insertions(+), 29 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index b02f5873a540b458d38e7951b4ee3d5ca598ae23..8e5847da4fab009ad699c18e1a5a336a8b45c3ed 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -259,20 +259,6 @@ jobs:
       fail-fast: false
       matrix:
         vector:
-          - jobname: linux-sha256
-            cc: clang
-            pool: ubuntu-latest
-          - jobname: linux-reftable
-            cc: clang
-            pool: ubuntu-latest
-          - jobname: linux-gcc
-            cc: gcc
-            cc_package: gcc-8
-            pool: ubuntu-20.04
-          - jobname: linux-TEST-vars
-            cc: gcc
-            cc_package: gcc-8
-            pool: ubuntu-20.04
           - jobname: osx-clang
             cc: clang
             pool: macos-13
@@ -285,21 +271,6 @@ jobs:
           - jobname: osx-meson
             cc: clang
             pool: macos-13
-          - jobname: linux-gcc-default
-            cc: gcc
-            pool: ubuntu-latest
-          - jobname: linux-leaks
-            cc: gcc
-            pool: ubuntu-latest
-          - jobname: linux-reftable-leaks
-            cc: gcc
-            pool: ubuntu-latest
-          - jobname: linux-asan-ubsan
-            cc: clang
-            pool: ubuntu-latest
-          - jobname: linux-meson
-            cc: gcc
-            pool: ubuntu-latest
     env:
       CC: ${{matrix.vector.cc}}
       CC_PACKAGE: ${{matrix.vector.cc_package}}
@@ -342,6 +313,44 @@ jobs:
       fail-fast: false
       matrix:
         vector:
+        - jobname: linux-sha256
+          image: ubuntu:latest
+          cc: clang
+          distro: ubuntu-latest
+        - jobname: linux-reftable
+          image: ubuntu:latest
+          cc: clang
+          distro: ubuntu-latest
+        - jobname: linux-gcc
+          image: ubuntu:20.04
+          cc: gcc
+          cc_package: gcc-8
+          distro: ubuntu-20.04
+        - jobname: linux-TEST-vars
+          image: ubuntu:20.04
+          cc: gcc
+          cc_package: gcc-8
+          distro: ubuntu-20.04
+        - jobname: linux-gcc-default
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
+        - jobname: linux-leaks
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
+        - jobname: linux-reftable-leaks
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
+        - jobname: linux-asan-ubsan
+          image: ubuntu:latest
+          cc: clang
+          distro: ubuntu-latest
+        - jobname: linux-meson
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
         - jobname: linux-musl
           image: alpine
           distro: alpine-latest
@@ -363,6 +372,7 @@ jobs:
     env:
       jobname: ${{matrix.vector.jobname}}
       distro: ${{matrix.vector.distro}}
+      CC: ${{matrix.vector.cc}}
     runs-on: ubuntu-latest
     container: ${{matrix.vector.image}}
     steps:

-- 
2.48.0.rc1.241.g6c04ab211c.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 05/10] github: simplify computation of the job's distro
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2025-01-03 14:46 ` [PATCH 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
@ 2025-01-03 14:46 ` Patrick Steinhardt
  2025-01-03 19:09   ` Junio C Hamano
  2025-01-03 14:46 ` [PATCH 06/10] gitlab-ci: remove the "linux-old" job Patrick Steinhardt
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 14:46 UTC (permalink / raw)
  To: git

We explicitly list the distro of Linux-based jobs, but it is equivalent
to the name of the image in almost all cases, except that colons are
replaced with dashes. Drop the redundant information and massage it in
our CI scripts, which is equivalent to how we do it in GitLab CI.

There are a couple of exceptions:

  - The "linux32" job, w whose distro name is different than the image
    name. This is handled by adapting all sites to use the new name.

  - The "alpine" and "fedora" jobs, neither of which specify a tag for
    their image. This is handled by adding the "latest" tag.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 22 ++++------------------
 ci/install-dependencies.sh |  4 ++--
 ci/lib.sh                  |  2 ++
 3 files changed, 8 insertions(+), 20 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 8e5847da4fab009ad699c18e1a5a336a8b45c3ed..b54da639a650682495994e3c7b137eab4e6cb3bf 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -275,7 +275,7 @@ jobs:
       CC: ${{matrix.vector.cc}}
       CC_PACKAGE: ${{matrix.vector.cc_package}}
       jobname: ${{matrix.vector.jobname}}
-      distro: ${{matrix.vector.pool}}
+      CI_JOB_IMAGE: ${{matrix.vector.pool}}
       TEST_OUTPUT_DIRECTORY: ${{github.workspace}}/t
     runs-on: ${{matrix.vector.pool}}
     steps:
@@ -316,63 +316,49 @@ jobs:
         - jobname: linux-sha256
           image: ubuntu:latest
           cc: clang
-          distro: ubuntu-latest
         - jobname: linux-reftable
           image: ubuntu:latest
           cc: clang
-          distro: ubuntu-latest
         - jobname: linux-gcc
           image: ubuntu:20.04
           cc: gcc
           cc_package: gcc-8
-          distro: ubuntu-20.04
         - jobname: linux-TEST-vars
           image: ubuntu:20.04
           cc: gcc
           cc_package: gcc-8
-          distro: ubuntu-20.04
         - jobname: linux-gcc-default
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-leaks
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-reftable-leaks
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-asan-ubsan
           image: ubuntu:latest
           cc: clang
-          distro: ubuntu-latest
         - jobname: linux-meson
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-musl
-          image: alpine
-          distro: alpine-latest
+          image: alpine:latest
         # Supported until 2025-04-02.
         - jobname: linux32
           image: i386/ubuntu:focal
-          distro: ubuntu32-20.04
         - jobname: pedantic
-          image: fedora
-          distro: fedora-latest
+          image: fedora:latest
         # A RHEL 8 compatible distro.  Supported until 2029-05-31.
         - jobname: almalinux-8
           image: almalinux:8
-          distro: almalinux-8
         # Supported until 2026-08-31.
         - jobname: debian-11
           image: debian:11
-          distro: debian-11
     env:
       jobname: ${{matrix.vector.jobname}}
-      distro: ${{matrix.vector.distro}}
       CC: ${{matrix.vector.cc}}
+      CI_JOB_IMAGE: ${{matrix.vector.image}}
     runs-on: ubuntu-latest
     container: ${{matrix.vector.image}}
     steps:
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index ecb5b9d36c20d3e7e96148ac628a96c62642c308..d5a959e25ff3236656ff3416b81732ec5c2107c1 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -33,7 +33,7 @@ fedora-*|almalinux-*)
 	dnf -yq update >/dev/null &&
 	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
 	;;
-ubuntu-*|ubuntu32-*|debian-*)
+ubuntu-*|i386/ubuntu-*|debian-*)
 	# Required so that apt doesn't wait for user input on certain packages.
 	export DEBIAN_FRONTEND=noninteractive
 
@@ -42,7 +42,7 @@ ubuntu-*|ubuntu32-*|debian-*)
 		SVN='libsvn-perl subversion'
 		LANGUAGES='language-pack-is'
 		;;
-	ubuntu32-*)
+	i386/ubuntu-*)
 		SVN=
 		LANGUAGES='language-pack-is'
 		;;
diff --git a/ci/lib.sh b/ci/lib.sh
index 8885ee3c3f86c62e8783d27756b8779bd491e7e6..f8b68ab8a6546802756fd516ca15a2c97223da5f 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -246,6 +246,8 @@ then
 
 	GIT_TEST_OPTS="--github-workflow-markup"
 	JOBS=10
+
+	distro=$(echo "$CI_JOB_IMAGE" | tr : -)
 elif test true = "$GITLAB_CI"
 then
 	CI_TYPE=gitlab-ci

-- 
2.48.0.rc1.241.g6c04ab211c.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 06/10] gitlab-ci: remove the "linux-old" job
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
                   ` (4 preceding siblings ...)
  2025-01-03 14:46 ` [PATCH 05/10] github: simplify computation of the job's distro Patrick Steinhardt
@ 2025-01-03 14:46 ` Patrick Steinhardt
  2025-01-03 19:12   ` Junio C Hamano
  2025-01-03 14:46 ` [PATCH 07/10] gitlab-ci: add linux32 job testing against i386 Patrick Steinhardt
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 14:46 UTC (permalink / raw)
  To: git

The "linux-old" job was historically testing against the oldest
supported LTS release of Ubuntu. But with c85bcb5de1 (gitlab-ci: switch
from Ubuntu 16.04 to 20.04, 2024-10-31) it has been converted to test
against Ubuntu 20.04, which already gets exercised in a couple of other
CI jobs. It's thus not adding any significant test coverage.

Drop the job.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 9254e01583306e67dc12b6b9e0015183e1108655..00bc727865031620752771af4a9030c7de1b73df 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -36,9 +36,6 @@ test:linux:
       fi
   parallel:
     matrix:
-      - jobname: linux-old
-        image: ubuntu:20.04
-        CC: gcc
       - jobname: linux-sha256
         image: ubuntu:latest
         CC: clang

-- 
2.48.0.rc1.241.g6c04ab211c.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 07/10] gitlab-ci: add linux32 job testing against i386
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
                   ` (5 preceding siblings ...)
  2025-01-03 14:46 ` [PATCH 06/10] gitlab-ci: remove the "linux-old" job Patrick Steinhardt
@ 2025-01-03 14:46 ` Patrick Steinhardt
  2025-01-03 14:46 ` [PATCH 08/10] ci: stop special-casing for Ubuntu 16.04 Patrick Steinhardt
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 14:46 UTC (permalink / raw)
  To: git

Add another job to GitLab CI that tests against the i386 architecture.
This job is equivalent to the same job in GitHub Workflows.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 2 ++
 ci/lib.sh      | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 00bc727865031620752771af4a9030c7de1b73df..29e9056dd5010f8843e42aeae8410973c825de54 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -66,6 +66,8 @@ test:linux:
         image: fedora:latest
       - jobname: linux-musl
         image: alpine:latest
+      - jobname: linux32
+        image: i386/ubuntu:20.04
       - jobname: linux-meson
         image: ubuntu:latest
         CC: gcc
diff --git a/ci/lib.sh b/ci/lib.sh
index f8b68ab8a6546802756fd516ca15a2c97223da5f..2293849ada3b45873f80e4392ab93c65657d0f13 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -269,7 +269,7 @@ then
 		CI_OS_NAME=osx
 		JOBS=$(nproc)
 		;;
-	*,alpine:*|*,fedora:*|*,ubuntu:*)
+	*,alpine:*|*,fedora:*|*,ubuntu:*|*,i386/ubuntu:*)
 		CI_OS_NAME=linux
 		JOBS=$(nproc)
 		;;

-- 
2.48.0.rc1.241.g6c04ab211c.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 08/10] ci: stop special-casing for Ubuntu 16.04
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
                   ` (6 preceding siblings ...)
  2025-01-03 14:46 ` [PATCH 07/10] gitlab-ci: add linux32 job testing against i386 Patrick Steinhardt
@ 2025-01-03 14:46 ` Patrick Steinhardt
  2025-01-03 14:46 ` [PATCH 09/10] ci: use latest Ubuntu release Patrick Steinhardt
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 14:46 UTC (permalink / raw)
  To: git

With c85bcb5de1 (gitlab-ci: switch from Ubuntu 16.04 to 20.04,
2024-10-31) we have adapted the last CI job to stop using Ubuntu 16.04
in favor of Ubuntu 20.04. Remove the special-casing we still have in our
CI scripts.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ci/lib.sh | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 2293849ada3b45873f80e4392ab93c65657d0f13..77a4aabdb8fb416c1733f02d02145b6bc0849998 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -347,14 +347,7 @@ ubuntu-*)
 	fi
 	MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/$PYTHON_PACKAGE"
 
-	case "$distro" in
-	ubuntu-16.04)
-		# Apache is too old for HTTP/2.
-		;;
-	*)
-		export GIT_TEST_HTTPD=true
-		;;
-	esac
+	export GIT_TEST_HTTPD=true
 
 	# The Linux build installs the defined dependency versions below.
 	# The OS X build installs much more recent versions, whichever

-- 
2.48.0.rc1.241.g6c04ab211c.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 09/10] ci: use latest Ubuntu release
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
                   ` (7 preceding siblings ...)
  2025-01-03 14:46 ` [PATCH 08/10] ci: stop special-casing for Ubuntu 16.04 Patrick Steinhardt
@ 2025-01-03 14:46 ` Patrick Steinhardt
  2025-01-03 14:46 ` [PATCH 10/10] ci: remove stale code for Azure Pipelines Patrick Steinhardt
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 14:46 UTC (permalink / raw)
  To: git

Both GitHub Actions and GitLab CI use the "ubuntu:latest" tag as the
default image for most jobs. This tag is somewhat misleading though, as
it does not refer to the latest release of Ubuntu, but to the latest LTS
release thereof. But as we already have a couple of jobs exercising the
oldest LTS release of Ubuntu that Git still supports, it would make more
sense to test the oldest and youngest versions of Ubuntu.

Adapt these jobs to instead use the "ubuntu:rolling" tag, which refers
to the actual latest release, which currently is Ubuntu 24.10.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 14 +++++++-------
 .gitlab-ci.yml             | 14 +++++++-------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index b54da639a650682495994e3c7b137eab4e6cb3bf..b90381ae015edf9db5aa4b8c0ace9bb5c549c37b 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -314,10 +314,10 @@ jobs:
       matrix:
         vector:
         - jobname: linux-sha256
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: clang
         - jobname: linux-reftable
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: clang
         - jobname: linux-gcc
           image: ubuntu:20.04
@@ -328,19 +328,19 @@ jobs:
           cc: gcc
           cc_package: gcc-8
         - jobname: linux-gcc-default
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-leaks
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-reftable-leaks
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-asan-ubsan
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: clang
         - jobname: linux-meson
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-musl
           image: alpine:latest
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 29e9056dd5010f8843e42aeae8410973c825de54..8ed3ff5f0373d70b6f609dc5292dda2dd7fd8f88 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -37,10 +37,10 @@ test:linux:
   parallel:
     matrix:
       - jobname: linux-sha256
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: clang
       - jobname: linux-reftable
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: clang
       - jobname: linux-gcc
         image: ubuntu:20.04
@@ -51,16 +51,16 @@ test:linux:
         CC: gcc
         CC_PACKAGE: gcc-8
       - jobname: linux-gcc-default
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
       - jobname: linux-leaks
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
       - jobname: linux-reftable-leaks
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
       - jobname: linux-asan-ubsan
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: clang
       - jobname: pedantic
         image: fedora:latest
@@ -69,7 +69,7 @@ test:linux:
       - jobname: linux32
         image: i386/ubuntu:20.04
       - jobname: linux-meson
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
   artifacts:
     paths:

-- 
2.48.0.rc1.241.g6c04ab211c.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 10/10] ci: remove stale code for Azure Pipelines
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
                   ` (8 preceding siblings ...)
  2025-01-03 14:46 ` [PATCH 09/10] ci: use latest Ubuntu release Patrick Steinhardt
@ 2025-01-03 14:46 ` Patrick Steinhardt
  2025-01-03 18:57 ` [PATCH 00/10] A couple of CI improvements Jeff King
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 14:46 UTC (permalink / raw)
  To: git

Support for Azure Pipelines has been retired in 6081d3898f (ci: retire
the Azure Pipelines definition, 2020-04-11) in favor of GitHub Actions.
Our CI library still has some infrastructure left for Azure though that
is now unused. Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ci/lib.sh                 | 21 +--------------------
 ci/print-test-failures.sh |  5 -----
 2 files changed, 1 insertion(+), 25 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 77a4aabdb8fb416c1733f02d02145b6bc0849998..4003354f16c048b969c0bb4340d2ee2777767300 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -206,26 +206,7 @@ export TERM=${TERM:-dumb}
 # Clear MAKEFLAGS that may come from the outside world.
 export MAKEFLAGS=
 
-if test -n "$SYSTEM_COLLECTIONURI" || test -n "$SYSTEM_TASKDEFINITIONSURI"
-then
-	CI_TYPE=azure-pipelines
-	# We are running in Azure Pipelines
-	CI_BRANCH="$BUILD_SOURCEBRANCH"
-	CI_COMMIT="$BUILD_SOURCEVERSION"
-	CI_JOB_ID="$BUILD_BUILDID"
-	CI_JOB_NUMBER="$BUILD_BUILDNUMBER"
-	CI_OS_NAME="$(echo "$AGENT_OS" | tr A-Z a-z)"
-	test darwin != "$CI_OS_NAME" || CI_OS_NAME=osx
-	CI_REPO_SLUG="$(expr "$BUILD_REPOSITORY_URI" : '.*/\([^/]*/[^/]*\)$')"
-	CC="${CC:-gcc}"
-
-	# use a subdirectory of the cache dir (because the file share is shared
-	# among *all* phases)
-	cache_dir="$HOME/test-cache/$SYSTEM_PHASENAME"
-
-	GIT_TEST_OPTS="--write-junit-xml"
-	JOBS=10
-elif test true = "$GITHUB_ACTIONS"
+if test true = "$GITHUB_ACTIONS"
 then
 	CI_TYPE=github-actions
 	CI_BRANCH="$GITHUB_REF"
diff --git a/ci/print-test-failures.sh b/ci/print-test-failures.sh
index 655687dd827e5b3e4d4879803b0d4499e7751380..dc910e51609cd7344b1ad03fdb4e820e47ad3a88 100755
--- a/ci/print-test-failures.sh
+++ b/ci/print-test-failures.sh
@@ -39,11 +39,6 @@ do
 		test_name="${test_name##*/}"
 		trash_dir="trash directory.$test_name"
 		case "$CI_TYPE" in
-		azure-pipelines)
-			mkdir -p failed-test-artifacts
-			mv "$trash_dir" failed-test-artifacts
-			continue
-			;;
 		github-actions)
 			mkdir -p failed-test-artifacts
 			echo "FAILED_TEST_ARTIFACTS=${TEST_OUTPUT_DIRECTORY:t}/failed-test-artifacts" >>$GITHUB_ENV

-- 
2.48.0.rc1.241.g6c04ab211c.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-03 14:46 ` [PATCH 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
@ 2025-01-03 18:17   ` Jeff King
  2025-01-06 11:12     ` Patrick Steinhardt
  0 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2025-01-03 18:17 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On Fri, Jan 03, 2025 at 03:46:39PM +0100, Patrick Steinhardt wrote:

> One test in t7422 asserts that `git submodule status --recursive`
> properly handles SIGPIPE. This test is flaky though and may sometimes
> not see a SIGPIPE at all:
> 
>     expecting success of 7422.18 'git submodule status --recursive propagates SIGPIPE':
>             { git submodule status --recursive 2>err; echo $?>status; } |
>                     grep -q X/S &&
>             test_must_be_empty err &&
>             test_match_signal 13 "$(cat status)"

I couldn't reproduce with --stress, but you can trigger it all the time
with:

diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index f21e920367..9338c75626 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -168,7 +168,7 @@ done
 
 test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
 	{ git submodule status --recursive 2>err; echo $?>status; } |
-		grep -q X/S &&
+		{ sleep 1 && grep -q X/S; } &&
 	test_must_be_empty err &&
 	test_match_signal 13 "$(cat status)"
 '

The problem is that git-submodule may write all of its output before
grep exits, and it gets stored in the pipe buffer. And then even if grep
exits before reading all of it, it is too late for SIGPIPE, and the data
in the pipe is just discarded by the OS.

So this:

> The issue is caused by us using grep(1) to terminate the pipe on the
> first matching line in the recursing git-submodule(1) process. Standard
> streams are typically buffered though, so this condition is racy and may
> cause us to terminate the pipe after git-submodule(1) has already
> exited, and in that case we wouldn't see the expected signal.
> 
> Fix the issue by converting standard streams to be unbuffered. I have
> only been able to reproduce this issue a single time after running t7422
> with `--stress` after an extended amount of time, so I cannot claim to
> be fully certain that this fix is sufficient.

isn't quite right. Even without input buffering on grep's part, it may
be too slow to read the data. And adding a sleep as above shows that it
still fails with your patch.

The usual way to reliably get SIGPIPE is to make sure the writer
produces enough data to fill the pipe buffer. But it's tricky to get
"submodule status" to produce a lot of data without having a ton of
submodules, which is expensive to set up.

But we can hack around it by stuffing the pipe full with a separate
process. Like this:

diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index f21e920367..c4df2629e8 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -167,8 +167,15 @@ do
 done
 
 test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
-	{ git submodule status --recursive 2>err; echo $?>status; } |
-		grep -q X/S &&
+	{
+		# stuff pipe buffer full of input so that submodule status
+		# will require blocking on write; this script will write over
+		# 128kb. It might itself get SIGPIPE, so we must not &&-chain
+		# it directly.
+		{ perl -le "print q{foo} for (1..33000)" || true; } &&
+		git submodule status --recursive 2>err
+		echo $? >status
+	} | { sleep 1 && head -n 1 >/dev/null; } &&
 	test_must_be_empty err &&
 	test_match_signal 13 "$(cat status)"
 '
A few notes:

  - the sleep is still there to demonstrate that it always works, but
    obviously we'd want to remove that

  - I swapped out "grep" for "head". What we are matching is not
    relevant; the important thing is that the reader closes the pipe
    immediately. So I guess in that sense we could probably even just
    pipe to "true" or similar.

  - I tried using test_seq to avoid the inline perl, but it doesn't
    work! The problem is that it's implemented as a shell function. So
    when it gets SIGPIPE, the whole subshell is killed, and we never
    even run git-submodule at all. So it has to be a separate process
    (though I guess it could be test_seq in a subshell).

Anyway, hopefully that gives you enough to play around with.

-Peff

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH 04/10] github: convert all Linux jobs to be containerized
  2025-01-03 14:46 ` [PATCH 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
@ 2025-01-03 18:56   ` Jeff King
  2025-01-03 19:06     ` Jeff King
  2025-01-03 19:16   ` Junio C Hamano
  1 sibling, 1 reply; 69+ messages in thread
From: Jeff King @ 2025-01-03 18:56 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On Fri, Jan 03, 2025 at 03:46:41PM +0100, Patrick Steinhardt wrote:

> We have split the CI jobs in GitHub Workflows into two categories:
> 
>   - Those running on a machine pool directly.
> 
>   - Those running in a container on the machine pool.
> 
> The latter is more flexible because it allows us to freely pick whatever
> container image we want to use for a specific job, while the former only
> allows us to pick from a handful of different distros. The containerized
> jobs shouldn't cause a significant slowdown, either, so they do not have
> any significant upside to the best of my knowlegde. The only upside that
> they did have before the preceding commit is that they run as a non-root
> user, but that has been addressed now.

I remember running into a few issues recently with containerized jobs,
so I dug in the archive a bit. The issue there was that the container
was not equipped to support the dynamically-linked version of node that
was being mounted into place (whereas the runner image from the CI
provider would work fine).

I guess that's probably not a big deal for us here. These are roughly
the same environments, just pulling from docker instead of relying on
the runner images. It's possible that Actions scripts might depend on
something special in the runner image, but in practice I think they try
to keep the dependencies pretty light.

So we're probably OK to proceed here, and deal with any problems in the
unlikely event that they come up.

I do wonder if it will affect run times. Presumably GitHub has made it
pretty fast to get things started on the bare runner image. Now we're
pulling docker images. That is hopefully pretty optimized and cached,
but it is extra work. Might be worth measuring.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 00/10] A couple of CI improvements
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
                   ` (9 preceding siblings ...)
  2025-01-03 14:46 ` [PATCH 10/10] ci: remove stale code for Azure Pipelines Patrick Steinhardt
@ 2025-01-03 18:57 ` Jeff King
  2025-01-06 11:16 ` [PATCH v2 " Patrick Steinhardt
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2025-01-03 18:57 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On Fri, Jan 03, 2025 at 03:46:37PM +0100, Patrick Steinhardt wrote:

> this patch series addresses a couple of issues I've found while
> investigating flaky CI jobs. Besides two more fixes for flaky jobs it
> also removes some stale code and simplifies the setup on GitHub Actions
> to always use containerized jobs on Linux.

I left comments on two patches, but the rest seemed fine to me (and I am
very happy to see cleanup of old/stale code).

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 04/10] github: convert all Linux jobs to be containerized
  2025-01-03 18:56   ` Jeff King
@ 2025-01-03 19:06     ` Jeff King
  2025-01-06 11:12       ` Patrick Steinhardt
  0 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2025-01-03 19:06 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On Fri, Jan 03, 2025 at 01:56:40PM -0500, Jeff King wrote:

> I do wonder if it will affect run times. Presumably GitHub has made it
> pretty fast to get things started on the bare runner image. Now we're
> pulling docker images. That is hopefully pretty optimized and cached,
> but it is extra work. Might be worth measuring.

Just peeking at your CI run here:

  https://github.com/git/git/actions/runs/12597967146

versus the latest run on Junio's master:

  https://github.com/git/git/actions/runs/12589300693

I see:

  job                 |  old | new
  --------------------|------|------
  linux-TEST-vars      11m30s 10m54s
  linux-asan-ubsan     30m26s 31m14s
  linux-gcc             9m47s 10m6s
  linux-gcc-default     9m47s  9m41s
  linux-leaks          25m50s 25m21s
  linux-meson          10m36s 10m41s
  linux-reftable       10m25s 10m23s
  linux-reftable-leaks 27m18s 27m28s
  linux-sha256          9m54s 10m31s

So it looks like any change is lost in the noise (sha256 is noticeably
slower, but most jobs aren't, and some are even faster).

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 05/10] github: simplify computation of the job's distro
  2025-01-03 14:46 ` [PATCH 05/10] github: simplify computation of the job's distro Patrick Steinhardt
@ 2025-01-03 19:09   ` Junio C Hamano
  0 siblings, 0 replies; 69+ messages in thread
From: Junio C Hamano @ 2025-01-03 19:09 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> We explicitly list the distro of Linux-based jobs, but it is equivalent
> to the name of the image in almost all cases, except that colons are
> replaced with dashes. Drop the redundant information and massage it in
> our CI scripts, which is equivalent to how we do it in GitLab CI.
>
> There are a couple of exceptions:
>
>   - The "linux32" job, w whose distro name is different than the image
>     name. This is handled by adapting all sites to use the new name.

"w whose"???

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 06/10] gitlab-ci: remove the "linux-old" job
  2025-01-03 14:46 ` [PATCH 06/10] gitlab-ci: remove the "linux-old" job Patrick Steinhardt
@ 2025-01-03 19:12   ` Junio C Hamano
  0 siblings, 0 replies; 69+ messages in thread
From: Junio C Hamano @ 2025-01-03 19:12 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> The "linux-old" job was historically testing against the oldest
> supported LTS release of Ubuntu. But with c85bcb5de1 (gitlab-ci: switch
> from Ubuntu 16.04 to 20.04, 2024-10-31) it has been converted to test
> against Ubuntu 20.04, which already gets exercised in a couple of other
> CI jobs. It's thus not adding any significant test coverage.
>
> Drop the job.

Dropping and reducing is always welcomed ;-)

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 04/10] github: convert all Linux jobs to be containerized
  2025-01-03 14:46 ` [PATCH 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
  2025-01-03 18:56   ` Jeff King
@ 2025-01-03 19:16   ` Junio C Hamano
  1 sibling, 0 replies; 69+ messages in thread
From: Junio C Hamano @ 2025-01-03 19:16 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> ... The containerized
> jobs shouldn't cause a significant slowdown, either, so they do not have
> any significant upside to the best of my knowlegde.

"shouldn't" is a somewhat hand-wavy word.

"knowlegde" -> "knowledge".

Are there security implications for us to worry about?  How tightly
are these container images controlled, relative to the way forges
prepare their selected environments?

Thanks.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 04/10] github: convert all Linux jobs to be containerized
  2025-01-03 19:06     ` Jeff King
@ 2025-01-06 11:12       ` Patrick Steinhardt
  0 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 11:12 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Fri, Jan 03, 2025 at 02:06:59PM -0500, Jeff King wrote:
> On Fri, Jan 03, 2025 at 01:56:40PM -0500, Jeff King wrote:
> 
> > I do wonder if it will affect run times. Presumably GitHub has made it
> > pretty fast to get things started on the bare runner image. Now we're
> > pulling docker images. That is hopefully pretty optimized and cached,
> > but it is extra work. Might be worth measuring.
> 
> Just peeking at your CI run here:
> 
>   https://github.com/git/git/actions/runs/12597967146
> 
> versus the latest run on Junio's master:
> 
>   https://github.com/git/git/actions/runs/12589300693
> 
> I see:
> 
>   job                 |  old | new
>   --------------------|------|------
>   linux-TEST-vars      11m30s 10m54s
>   linux-asan-ubsan     30m26s 31m14s
>   linux-gcc             9m47s 10m6s
>   linux-gcc-default     9m47s  9m41s
>   linux-leaks          25m50s 25m21s
>   linux-meson          10m36s 10m41s
>   linux-reftable       10m25s 10m23s
>   linux-reftable-leaks 27m18s 27m28s
>   linux-sha256          9m54s 10m31s
> 
> So it looks like any change is lost in the noise (sha256 is noticeably
> slower, but most jobs aren't, and some are even faster).

Thanks for verifying my claims!

Patrick

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-03 18:17   ` Jeff King
@ 2025-01-06 11:12     ` Patrick Steinhardt
  2025-01-07  2:39       ` Jeff King
  2025-01-07  2:48       ` Jeff King
  0 siblings, 2 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 11:12 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Fri, Jan 03, 2025 at 01:17:39PM -0500, Jeff King wrote:
> On Fri, Jan 03, 2025 at 03:46:39PM +0100, Patrick Steinhardt wrote:
> > One test in t7422 asserts that `git submodule status --recursive`
> > properly handles SIGPIPE. This test is flaky though and may sometimes
> > not see a SIGPIPE at all:
> > 
> >     expecting success of 7422.18 'git submodule status --recursive propagates SIGPIPE':
> >             { git submodule status --recursive 2>err; echo $?>status; } |
> >                     grep -q X/S &&
> >             test_must_be_empty err &&
> >             test_match_signal 13 "$(cat status)"
> 
> I couldn't reproduce with --stress, but you can trigger it all the time
> with:
> 
> diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
> index f21e920367..9338c75626 100755
> --- a/t/t7422-submodule-output.sh
> +++ b/t/t7422-submodule-output.sh
> @@ -168,7 +168,7 @@ done
>  
>  test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
>  	{ git submodule status --recursive 2>err; echo $?>status; } |
> -		grep -q X/S &&
> +		{ sleep 1 && grep -q X/S; } &&
>  	test_must_be_empty err &&
>  	test_match_signal 13 "$(cat status)"
>  '
> 
> The problem is that git-submodule may write all of its output before
> grep exits, and it gets stored in the pipe buffer. And then even if grep
> exits before reading all of it, it is too late for SIGPIPE, and the data
> in the pipe is just discarded by the OS.
> 
> So this:
> 
> > The issue is caused by us using grep(1) to terminate the pipe on the
> > first matching line in the recursing git-submodule(1) process. Standard
> > streams are typically buffered though, so this condition is racy and may
> > cause us to terminate the pipe after git-submodule(1) has already
> > exited, and in that case we wouldn't see the expected signal.
> > 
> > Fix the issue by converting standard streams to be unbuffered. I have
> > only been able to reproduce this issue a single time after running t7422
> > with `--stress` after an extended amount of time, so I cannot claim to
> > be fully certain that this fix is sufficient.
> 
> isn't quite right. Even without input buffering on grep's part, it may
> be too slow to read the data. And adding a sleep as above shows that it
> still fails with your patch.

Great. I was hoping to nerd-snipe somebody into helping me out with the
last sentence in my above paragraph :) Happy to see that you bit.

> The usual way to reliably get SIGPIPE is to make sure the writer
> produces enough data to fill the pipe buffer. But it's tricky to get
> "submodule status" to produce a lot of data without having a ton of
> submodules, which is expensive to set up.
> 
> But we can hack around it by stuffing the pipe full with a separate
> process. Like this:
> 
> diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
> index f21e920367..c4df2629e8 100755
> --- a/t/t7422-submodule-output.sh
> +++ b/t/t7422-submodule-output.sh
> @@ -167,8 +167,15 @@ do
>  done
>  
>  test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
> -	{ git submodule status --recursive 2>err; echo $?>status; } |
> -		grep -q X/S &&
> +	{
> +		# stuff pipe buffer full of input so that submodule status
> +		# will require blocking on write; this script will write over
> +		# 128kb. It might itself get SIGPIPE, so we must not &&-chain
> +		# it directly.
> +		{ perl -le "print q{foo} for (1..33000)" || true; } &&
> +		git submodule status --recursive 2>err
> +		echo $? >status
> +	} | { sleep 1 && head -n 1 >/dev/null; } &&
>  	test_must_be_empty err &&
>  	test_match_signal 13 "$(cat status)"
>  '
> A few notes:
> 
>   - the sleep is still there to demonstrate that it always works, but
>     obviously we'd want to remove that

Nice, this indeed lets me reproduce the issue reliably.

>   - I swapped out "grep" for "head". What we are matching is not
>     relevant; the important thing is that the reader closes the pipe
>     immediately. So I guess in that sense we could probably even just
>     pipe to "true" or similar.

I think the grep(1) is relevant though. The test explicitly verifies
that `--recursive` propagates SIGPIPE, so we must make sure that we
trigger the SIGPIPE when the child process produces output, not when the
parent process produces it. That's why we grep for "X/S", where "X" is a
submodule -- it means that we know that it is currently the subprocess
doing its thing.

It also simplifies the code a bit given that the call to Perl doesn't
need `|| true` anymore.

>   - I tried using test_seq to avoid the inline perl, but it doesn't
>     work! The problem is that it's implemented as a shell function. So
>     when it gets SIGPIPE, the whole subshell is killed, and we never
>     even run git-submodule at all. So it has to be a separate process
>     (though I guess it could be test_seq in a subshell).

And that one should also work if we retain the grep. I wonder though
whether we shouldn't prefer to use Perl regardless as it's likely to be
faster when generating all that gibberish. Perl is basically a hard
prerequisite for our tests anyway, so it doesn't really hurt to call it
here.

Patrick

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v2 00/10] A couple of CI improvements
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
                   ` (10 preceding siblings ...)
  2025-01-03 18:57 ` [PATCH 00/10] A couple of CI improvements Jeff King
@ 2025-01-06 11:16 ` Patrick Steinhardt
  2025-01-06 11:16   ` [PATCH v2 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
                     ` (9 more replies)
  2025-01-07 12:30 ` [PATCH v3 00/10] A couple of CI improvements Patrick Steinhardt
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
  13 siblings, 10 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 11:16 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

Hi,

this patch series addresses a couple of issues I've found while
investigating flaky CI jobs. Besides two more fixes for flaky jobs it
also removes some stale code and simplifies the setup on GitHub Actions
to always use containerized jobs on Linux.

Test runs can be found for GitLab [1] and GitHub [2].

Changes in v2:

  - Expand a bit on the reasoning behind the conversion to use
    containerized jobs.
  - Fix commit message typo.
  - Properly fix the race in t7422 via pipe stuffing, as proposed by
    Peff.
  - Link to v1: https://lore.kernel.org/r/20250103-b4-pks-ci-fixes-v1-0-a9bb95dff833@pks.im

Thanks!

Patrick

[1]: https://gitlab.com/gitlab-org/git/-/merge_requests/277
[2]: https://github.com/git/git/pull/1865

---
Patrick Steinhardt (10):
      t0060: fix EBUSY in MinGW when setting up runtime prefix
      t7422: fix flaky test caused by buffered stdout
      github: adapt containerized jobs to be rootless
      github: convert all Linux jobs to be containerized
      github: simplify computation of the job's distro
      gitlab-ci: remove the "linux-old" job
      gitlab-ci: add linux32 job testing against i386
      ci: stop special-casing for Ubuntu 16.04
      ci: use latest Ubuntu release
      ci: remove stale code for Azure Pipelines

 .github/workflows/main.yml  | 78 ++++++++++++++++++++++-----------------------
 .gitlab-ci.yml              | 19 ++++++-----
 ci/install-dependencies.sh  |  6 ++--
 ci/lib.sh                   | 34 +++-----------------
 ci/print-test-failures.sh   |  5 ---
 t/t0060-path-utils.sh       | 10 +++---
 t/t7422-submodule-output.sh | 10 ++++--
 7 files changed, 69 insertions(+), 93 deletions(-)

Range-diff versus v1:

 1:  8ef1870c39 =  1:  14a80c2683 t0060: fix EBUSY in MinGW when setting up runtime prefix
 2:  f0647aad30 <  -:  ---------- t7422: fix flaky test caused by buffered stdout
 -:  ---------- >  2:  967e76f482 t7422: fix flaky test caused by buffered stdout
 3:  2768ecb60c =  3:  bd2bae13e4 github: adapt containerized jobs to be rootless
 4:  3a8aafdc32 !  4:  bc0bf7b8d5 github: convert all Linux jobs to be containerized
    @@ Commit message
         The latter is more flexible because it allows us to freely pick whatever
         container image we want to use for a specific job, while the former only
         allows us to pick from a handful of different distros. The containerized
    -    jobs shouldn't cause a significant slowdown, either, so they do not have
    -    any significant upside to the best of my knowlegde. The only upside that
    -    they did have before the preceding commit is that they run as a non-root
    -    user, but that has been addressed now.
    +    jobs do not have any significant downsides to the best of my knowledge:
     
    -    Convert all Linux jobs to be containerized for additional flexibility.
    +      - They aren't significantly slower to start up. A quick comparison by
    +        Peff shows that the difference is mostly lost in the noise:
    +
    +                job             |  old | new
    +            --------------------|------|------
    +            linux-TEST-vars      11m30s 10m54s
    +            linux-asan-ubsan     30m26s 31m14s
    +            linux-gcc             9m47s 10m6s
    +            linux-gcc-default     9m47s  9m41s
    +            linux-leaks          25m50s 25m21s
    +            linux-meson          10m36s 10m41s
    +            linux-reftable       10m25s 10m23s
    +            linux-reftable-leaks 27m18s 27m28s
    +            linux-sha256          9m54s 10m31s
    +
    +        Some jobs are a bit faster, some are a bit slower, but there does
    +        not seem to be any significant change.
    +
    +      - Containerized jobs run as root, which keeps a couple of tests from
    +        running. This has been addressed in the preceding commit though,
    +        where we now use setpriv(1) to run tests as a separate user.
    +
    +      - GitHub injects a Node binary into containerized jobs, which is
    +        dynamically linked. This has led to some issues in the past [1], but
    +        only for our 32 bit jobs. The issues have since been resolved.
    +
    +    Overall there seem to be no downsides, but the upside is that we have
    +    more control over the exact image that these jobs use. Convert the Linux
    +    jobs accordingly.
    +
    +    [1]: https://lore.kernel.org/git/20240912094841.GD589828@coredump.intra.peff.net/
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
 5:  a50ee3dd9a !  5:  22bd775ad0 github: simplify computation of the job's distro
    @@ Commit message
     
         There are a couple of exceptions:
     
    -      - The "linux32" job, w whose distro name is different than the image
    +      - The "linux32" job, whose distro name is different than the image
             name. This is handled by adapting all sites to use the new name.
     
           - The "alpine" and "fedora" jobs, neither of which specify a tag for
 6:  b31305597e =  6:  ddce6be0b6 gitlab-ci: remove the "linux-old" job
 7:  dfa41f5593 =  7:  40a0c1e22a gitlab-ci: add linux32 job testing against i386
 8:  bd1efb0373 =  8:  d775afb9c3 ci: stop special-casing for Ubuntu 16.04
 9:  fa505756a7 =  9:  0dd988643f ci: use latest Ubuntu release
10:  c64af8aa78 = 10:  bdca84eebd ci: remove stale code for Azure Pipelines

---
base-commit: 1b4e9a5f8b5f048972c21fe8acafe0404096f694
change-id: 20250103-b4-pks-ci-fixes-2d0a23fb5c78


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v2 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix
  2025-01-06 11:16 ` [PATCH v2 " Patrick Steinhardt
@ 2025-01-06 11:16   ` Patrick Steinhardt
  2025-01-06 11:16   ` [PATCH v2 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 11:16 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

Two of our tests in t0060 verify that the runtime prefix functionality
works as expected by creating a separate directory hierarchy, copying
the Git executable in there and then creating scripts relative to that
executable.

These tests fail quite regularly in GitLab CI with the following error:

    expecting success of 0060.218 '%(prefix)/ works':
            mkdir -p pretend/bin &&
            cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
            git config yes.path "%(prefix)/yes" &&
            GIT_EXEC_PATH= ./pretend/bin/git config --path yes.path >actual &&
            echo "$(pwd)/pretend/yes" >expect &&
            test_cmp expect actual
    ++ mkdir -p pretend/bin
    ++ cp /c/GitLab-Runner/builds/gitlab-org/git/git.exe pretend/bin/
    cp: cannot create regular file 'pretend/bin/git.exe': Device or resource busy
    error: last command exited with $?=1
    not ok 218 - %(prefix)/ works

Seemingly, the "git.exe" binary we are trying to overwrite is still
being held open. It is somewhat puzzling why exactly that is: while the
preceding test _does_ write to and execute the same path, it should have
exited and shouldn't keep any backgrounded processes around. So it must
be held open by something else, either in MinGW or in Windows itself.

While the root cause is puzzling, the workaround is trivial enough:
instead of writing the file twice we simply pull the common setup into a
separate test case so that we won't observe EBUSY in the first place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t0060-path-utils.sh | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index dbb2e73bcd912ae6a804603ff54e4c609966fa5d..8545cdfab559b4e247cb2699965e637529fd930a 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -592,17 +592,19 @@ test_lazy_prereq CAN_EXEC_IN_PWD '
 	./git rev-parse
 '
 
+test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'setup runtime prefix' '
+	mkdir -p pretend/bin &&
+	cp "$GIT_EXEC_PATH"/git$X pretend/bin/
+'
+
 test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX works' '
-	mkdir -p pretend/bin pretend/libexec/git-core &&
+	mkdir -p pretend/libexec/git-core &&
 	echo "echo HERE" | write_script pretend/libexec/git-core/git-here &&
-	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
 	GIT_EXEC_PATH= ./pretend/bin/git here >actual &&
 	echo HERE >expect &&
 	test_cmp expect actual'
 
 test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' '
-	mkdir -p pretend/bin &&
-	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
 	git config yes.path "%(prefix)/yes" &&
 	GIT_EXEC_PATH= ./pretend/bin/git config --path yes.path >actual &&
 	echo "$(pwd)/pretend/yes" >expect &&

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-06 11:16 ` [PATCH v2 " Patrick Steinhardt
  2025-01-06 11:16   ` [PATCH v2 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
@ 2025-01-06 11:16   ` Patrick Steinhardt
  2025-01-07  2:49     ` Jeff King
  2025-01-06 11:16   ` [PATCH v2 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
                     ` (7 subsequent siblings)
  9 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 11:16 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

One test in t7422 asserts that `git submodule status --recursive`
properly handles SIGPIPE. This test is flaky though and may sometimes
not see a SIGPIPE at all:

    expecting success of 7422.18 'git submodule status --recursive propagates SIGPIPE':
            { git submodule status --recursive 2>err; echo $?>status; } |
                    grep -q X/S &&
            test_must_be_empty err &&
            test_match_signal 13 "$(cat status)"
    ++ git submodule status --recursive
    ++ grep -q X/S
    ++ echo 0
    ++ test_must_be_empty err
    ++ test 1 -ne 1
    ++ test_path_is_file err
    ++ test 1 -ne 1
    ++ test -f err
    ++ test -s err
    +++ cat status
    ++ test_match_signal 13 0
    ++ test 0 = 141
    ++ test 0 = 269
    ++ return 1
    error: last command exited with $?=1
    not ok 18 - git submodule status --recursive propagates SIGPIPE

The issue is caused by us using grep(1) to terminate the pipe on the
first matching line in the recursing git-submodule(1) process. Standard
streams are typically buffered though, so this condition is racy and may
cause us to terminate the pipe after git-submodule(1) has already
exited, and in that case we wouldn't see the expected signal.

Fix the issue by making the writer fill the pipe buffer before we
execute git-submodule(1). Ideally, it would be git-submodule(1) itself
that does produce all that data, but it would require us to create a
large amount of submodules, which is inefficient. Instead, we use Perl
to print gibberish until the buffer is filled.

To verify that this works as expected one can apply the following patch
to the preimage of this commit, which used to reliably trigger the race:

    diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
    index f21e920367..9338c75626 100755
    --- a/t/t7422-submodule-output.sh
    +++ b/t/t7422-submodule-output.sh
    @@ -168,7 +168,7 @@ done

     test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
            { git submodule status --recursive 2>err; echo $?>status; } |
    -		grep -q X/S &&
    +		{ sleep 1 && grep -q X/S; } &&
            test_must_be_empty err &&
            test_match_signal 13 "$(cat status)"
     '

With the pipe-stuffing workaround the test runs successfully.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t7422-submodule-output.sh | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index f21e9203678b94701281d5339ae8bfe53d5de0ed..976f91b0ebd9d82daee3802a212dd3f4031fe86b 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -167,8 +167,14 @@ do
 done
 
 test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
-	{ git submodule status --recursive 2>err; echo $?>status; } |
-		grep -q X/S &&
+	{
+		# Stuff pipe buffer full of input so that `git submodule
+		# status` will block on write; this script will write over
+		# 128kb.
+		perl -le "print q{foo} for (1..33000)" &&
+		git submodule status --recursive 2>err
+		echo $?>status
+	} | grep -q X/S &&
 	test_must_be_empty err &&
 	test_match_signal 13 "$(cat status)"
 '

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 03/10] github: adapt containerized jobs to be rootless
  2025-01-06 11:16 ` [PATCH v2 " Patrick Steinhardt
  2025-01-06 11:16   ` [PATCH v2 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
  2025-01-06 11:16   ` [PATCH v2 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
@ 2025-01-06 11:16   ` Patrick Steinhardt
  2025-01-06 11:16   ` [PATCH v2 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 11:16 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

The containerized jobs in GitHub Actions run as root, giving them
special permissions to for example delete files even when the user
shouldn't be able to due to file permissions. This limitation keeps us
from using containerized jobs for most of our Ubuntu-based jobs as it
causes a number of tests to fail.

Adapt the jobs to create a separate user that executes the test suite.
This follows similar infrastructure that we already have in GitLab CI.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 6 ++++--
 ci/install-dependencies.sh | 2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 900be9957a23fcaa64e1aefd0c8638c5f84b7997..b02f5873a540b458d38e7951b4ee3d5ca598ae23 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -371,10 +371,12 @@ jobs:
       run: apt -q update && apt -q -y install libc6-amd64 lib64stdc++6
     - uses: actions/checkout@v4
     - run: ci/install-dependencies.sh
-    - run: ci/run-build-and-tests.sh
+    - run: useradd builder --create-home
+    - run: chown -R builder .
+    - run: sudo --preserve-env --set-home --user=builder ci/run-build-and-tests.sh
     - name: print test failures
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
-      run: ci/print-test-failures.sh
+      run: sudo --preserve-env --set-home --user=builder ci/print-test-failures.sh
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v4
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index d1cb9fa8785388b3674fcea4dd682abc0725c968..ecb5b9d36c20d3e7e96148ac628a96c62642c308 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -31,7 +31,7 @@ alpine-*)
 	;;
 fedora-*|almalinux-*)
 	dnf -yq update >/dev/null &&
-	dnf -yq install make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
+	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
 	;;
 ubuntu-*|ubuntu32-*|debian-*)
 	# Required so that apt doesn't wait for user input on certain packages.

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 04/10] github: convert all Linux jobs to be containerized
  2025-01-06 11:16 ` [PATCH v2 " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2025-01-06 11:16   ` [PATCH v2 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
@ 2025-01-06 11:16   ` Patrick Steinhardt
  2025-01-06 11:16   ` [PATCH v2 05/10] github: simplify computation of the job's distro Patrick Steinhardt
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 11:16 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

We have split the CI jobs in GitHub Workflows into two categories:

  - Those running on a machine pool directly.

  - Those running in a container on the machine pool.

The latter is more flexible because it allows us to freely pick whatever
container image we want to use for a specific job, while the former only
allows us to pick from a handful of different distros. The containerized
jobs do not have any significant downsides to the best of my knowledge:

  - They aren't significantly slower to start up. A quick comparison by
    Peff shows that the difference is mostly lost in the noise:

            job             |  old | new
        --------------------|------|------
        linux-TEST-vars      11m30s 10m54s
        linux-asan-ubsan     30m26s 31m14s
        linux-gcc             9m47s 10m6s
        linux-gcc-default     9m47s  9m41s
        linux-leaks          25m50s 25m21s
        linux-meson          10m36s 10m41s
        linux-reftable       10m25s 10m23s
        linux-reftable-leaks 27m18s 27m28s
        linux-sha256          9m54s 10m31s

    Some jobs are a bit faster, some are a bit slower, but there does
    not seem to be any significant change.

  - Containerized jobs run as root, which keeps a couple of tests from
    running. This has been addressed in the preceding commit though,
    where we now use setpriv(1) to run tests as a separate user.

  - GitHub injects a Node binary into containerized jobs, which is
    dynamically linked. This has led to some issues in the past [1], but
    only for our 32 bit jobs. The issues have since been resolved.

Overall there seem to be no downsides, but the upside is that we have
more control over the exact image that these jobs use. Convert the Linux
jobs accordingly.

[1]: https://lore.kernel.org/git/20240912094841.GD589828@coredump.intra.peff.net/

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 68 ++++++++++++++++++++++++++--------------------
 1 file changed, 39 insertions(+), 29 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index b02f5873a540b458d38e7951b4ee3d5ca598ae23..8e5847da4fab009ad699c18e1a5a336a8b45c3ed 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -259,20 +259,6 @@ jobs:
       fail-fast: false
       matrix:
         vector:
-          - jobname: linux-sha256
-            cc: clang
-            pool: ubuntu-latest
-          - jobname: linux-reftable
-            cc: clang
-            pool: ubuntu-latest
-          - jobname: linux-gcc
-            cc: gcc
-            cc_package: gcc-8
-            pool: ubuntu-20.04
-          - jobname: linux-TEST-vars
-            cc: gcc
-            cc_package: gcc-8
-            pool: ubuntu-20.04
           - jobname: osx-clang
             cc: clang
             pool: macos-13
@@ -285,21 +271,6 @@ jobs:
           - jobname: osx-meson
             cc: clang
             pool: macos-13
-          - jobname: linux-gcc-default
-            cc: gcc
-            pool: ubuntu-latest
-          - jobname: linux-leaks
-            cc: gcc
-            pool: ubuntu-latest
-          - jobname: linux-reftable-leaks
-            cc: gcc
-            pool: ubuntu-latest
-          - jobname: linux-asan-ubsan
-            cc: clang
-            pool: ubuntu-latest
-          - jobname: linux-meson
-            cc: gcc
-            pool: ubuntu-latest
     env:
       CC: ${{matrix.vector.cc}}
       CC_PACKAGE: ${{matrix.vector.cc_package}}
@@ -342,6 +313,44 @@ jobs:
       fail-fast: false
       matrix:
         vector:
+        - jobname: linux-sha256
+          image: ubuntu:latest
+          cc: clang
+          distro: ubuntu-latest
+        - jobname: linux-reftable
+          image: ubuntu:latest
+          cc: clang
+          distro: ubuntu-latest
+        - jobname: linux-gcc
+          image: ubuntu:20.04
+          cc: gcc
+          cc_package: gcc-8
+          distro: ubuntu-20.04
+        - jobname: linux-TEST-vars
+          image: ubuntu:20.04
+          cc: gcc
+          cc_package: gcc-8
+          distro: ubuntu-20.04
+        - jobname: linux-gcc-default
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
+        - jobname: linux-leaks
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
+        - jobname: linux-reftable-leaks
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
+        - jobname: linux-asan-ubsan
+          image: ubuntu:latest
+          cc: clang
+          distro: ubuntu-latest
+        - jobname: linux-meson
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
         - jobname: linux-musl
           image: alpine
           distro: alpine-latest
@@ -363,6 +372,7 @@ jobs:
     env:
       jobname: ${{matrix.vector.jobname}}
       distro: ${{matrix.vector.distro}}
+      CC: ${{matrix.vector.cc}}
     runs-on: ubuntu-latest
     container: ${{matrix.vector.image}}
     steps:

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 05/10] github: simplify computation of the job's distro
  2025-01-06 11:16 ` [PATCH v2 " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2025-01-06 11:16   ` [PATCH v2 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
@ 2025-01-06 11:16   ` Patrick Steinhardt
  2025-01-06 11:16   ` [PATCH v2 06/10] gitlab-ci: remove the "linux-old" job Patrick Steinhardt
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 11:16 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

We explicitly list the distro of Linux-based jobs, but it is equivalent
to the name of the image in almost all cases, except that colons are
replaced with dashes. Drop the redundant information and massage it in
our CI scripts, which is equivalent to how we do it in GitLab CI.

There are a couple of exceptions:

  - The "linux32" job, whose distro name is different than the image
    name. This is handled by adapting all sites to use the new name.

  - The "alpine" and "fedora" jobs, neither of which specify a tag for
    their image. This is handled by adding the "latest" tag.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 22 ++++------------------
 ci/install-dependencies.sh |  4 ++--
 ci/lib.sh                  |  2 ++
 3 files changed, 8 insertions(+), 20 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 8e5847da4fab009ad699c18e1a5a336a8b45c3ed..b54da639a650682495994e3c7b137eab4e6cb3bf 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -275,7 +275,7 @@ jobs:
       CC: ${{matrix.vector.cc}}
       CC_PACKAGE: ${{matrix.vector.cc_package}}
       jobname: ${{matrix.vector.jobname}}
-      distro: ${{matrix.vector.pool}}
+      CI_JOB_IMAGE: ${{matrix.vector.pool}}
       TEST_OUTPUT_DIRECTORY: ${{github.workspace}}/t
     runs-on: ${{matrix.vector.pool}}
     steps:
@@ -316,63 +316,49 @@ jobs:
         - jobname: linux-sha256
           image: ubuntu:latest
           cc: clang
-          distro: ubuntu-latest
         - jobname: linux-reftable
           image: ubuntu:latest
           cc: clang
-          distro: ubuntu-latest
         - jobname: linux-gcc
           image: ubuntu:20.04
           cc: gcc
           cc_package: gcc-8
-          distro: ubuntu-20.04
         - jobname: linux-TEST-vars
           image: ubuntu:20.04
           cc: gcc
           cc_package: gcc-8
-          distro: ubuntu-20.04
         - jobname: linux-gcc-default
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-leaks
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-reftable-leaks
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-asan-ubsan
           image: ubuntu:latest
           cc: clang
-          distro: ubuntu-latest
         - jobname: linux-meson
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-musl
-          image: alpine
-          distro: alpine-latest
+          image: alpine:latest
         # Supported until 2025-04-02.
         - jobname: linux32
           image: i386/ubuntu:focal
-          distro: ubuntu32-20.04
         - jobname: pedantic
-          image: fedora
-          distro: fedora-latest
+          image: fedora:latest
         # A RHEL 8 compatible distro.  Supported until 2029-05-31.
         - jobname: almalinux-8
           image: almalinux:8
-          distro: almalinux-8
         # Supported until 2026-08-31.
         - jobname: debian-11
           image: debian:11
-          distro: debian-11
     env:
       jobname: ${{matrix.vector.jobname}}
-      distro: ${{matrix.vector.distro}}
       CC: ${{matrix.vector.cc}}
+      CI_JOB_IMAGE: ${{matrix.vector.image}}
     runs-on: ubuntu-latest
     container: ${{matrix.vector.image}}
     steps:
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index ecb5b9d36c20d3e7e96148ac628a96c62642c308..d5a959e25ff3236656ff3416b81732ec5c2107c1 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -33,7 +33,7 @@ fedora-*|almalinux-*)
 	dnf -yq update >/dev/null &&
 	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
 	;;
-ubuntu-*|ubuntu32-*|debian-*)
+ubuntu-*|i386/ubuntu-*|debian-*)
 	# Required so that apt doesn't wait for user input on certain packages.
 	export DEBIAN_FRONTEND=noninteractive
 
@@ -42,7 +42,7 @@ ubuntu-*|ubuntu32-*|debian-*)
 		SVN='libsvn-perl subversion'
 		LANGUAGES='language-pack-is'
 		;;
-	ubuntu32-*)
+	i386/ubuntu-*)
 		SVN=
 		LANGUAGES='language-pack-is'
 		;;
diff --git a/ci/lib.sh b/ci/lib.sh
index 8885ee3c3f86c62e8783d27756b8779bd491e7e6..f8b68ab8a6546802756fd516ca15a2c97223da5f 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -246,6 +246,8 @@ then
 
 	GIT_TEST_OPTS="--github-workflow-markup"
 	JOBS=10
+
+	distro=$(echo "$CI_JOB_IMAGE" | tr : -)
 elif test true = "$GITLAB_CI"
 then
 	CI_TYPE=gitlab-ci

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 06/10] gitlab-ci: remove the "linux-old" job
  2025-01-06 11:16 ` [PATCH v2 " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2025-01-06 11:16   ` [PATCH v2 05/10] github: simplify computation of the job's distro Patrick Steinhardt
@ 2025-01-06 11:16   ` Patrick Steinhardt
  2025-01-06 11:16   ` [PATCH v2 07/10] gitlab-ci: add linux32 job testing against i386 Patrick Steinhardt
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 11:16 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

The "linux-old" job was historically testing against the oldest
supported LTS release of Ubuntu. But with c85bcb5de1 (gitlab-ci: switch
from Ubuntu 16.04 to 20.04, 2024-10-31) it has been converted to test
against Ubuntu 20.04, which already gets exercised in a couple of other
CI jobs. It's thus not adding any significant test coverage.

Drop the job.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 9254e01583306e67dc12b6b9e0015183e1108655..00bc727865031620752771af4a9030c7de1b73df 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -36,9 +36,6 @@ test:linux:
       fi
   parallel:
     matrix:
-      - jobname: linux-old
-        image: ubuntu:20.04
-        CC: gcc
       - jobname: linux-sha256
         image: ubuntu:latest
         CC: clang

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 07/10] gitlab-ci: add linux32 job testing against i386
  2025-01-06 11:16 ` [PATCH v2 " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2025-01-06 11:16   ` [PATCH v2 06/10] gitlab-ci: remove the "linux-old" job Patrick Steinhardt
@ 2025-01-06 11:16   ` Patrick Steinhardt
  2025-01-06 11:16   ` [PATCH v2 08/10] ci: stop special-casing for Ubuntu 16.04 Patrick Steinhardt
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 11:16 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

Add another job to GitLab CI that tests against the i386 architecture.
This job is equivalent to the same job in GitHub Workflows.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 2 ++
 ci/lib.sh      | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 00bc727865031620752771af4a9030c7de1b73df..29e9056dd5010f8843e42aeae8410973c825de54 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -66,6 +66,8 @@ test:linux:
         image: fedora:latest
       - jobname: linux-musl
         image: alpine:latest
+      - jobname: linux32
+        image: i386/ubuntu:20.04
       - jobname: linux-meson
         image: ubuntu:latest
         CC: gcc
diff --git a/ci/lib.sh b/ci/lib.sh
index f8b68ab8a6546802756fd516ca15a2c97223da5f..2293849ada3b45873f80e4392ab93c65657d0f13 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -269,7 +269,7 @@ then
 		CI_OS_NAME=osx
 		JOBS=$(nproc)
 		;;
-	*,alpine:*|*,fedora:*|*,ubuntu:*)
+	*,alpine:*|*,fedora:*|*,ubuntu:*|*,i386/ubuntu:*)
 		CI_OS_NAME=linux
 		JOBS=$(nproc)
 		;;

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 08/10] ci: stop special-casing for Ubuntu 16.04
  2025-01-06 11:16 ` [PATCH v2 " Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2025-01-06 11:16   ` [PATCH v2 07/10] gitlab-ci: add linux32 job testing against i386 Patrick Steinhardt
@ 2025-01-06 11:16   ` Patrick Steinhardt
  2025-01-06 11:16   ` [PATCH v2 09/10] ci: use latest Ubuntu release Patrick Steinhardt
  2025-01-06 11:16   ` [PATCH v2 10/10] ci: remove stale code for Azure Pipelines Patrick Steinhardt
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 11:16 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

With c85bcb5de1 (gitlab-ci: switch from Ubuntu 16.04 to 20.04,
2024-10-31) we have adapted the last CI job to stop using Ubuntu 16.04
in favor of Ubuntu 20.04. Remove the special-casing we still have in our
CI scripts.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ci/lib.sh | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 2293849ada3b45873f80e4392ab93c65657d0f13..77a4aabdb8fb416c1733f02d02145b6bc0849998 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -347,14 +347,7 @@ ubuntu-*)
 	fi
 	MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/$PYTHON_PACKAGE"
 
-	case "$distro" in
-	ubuntu-16.04)
-		# Apache is too old for HTTP/2.
-		;;
-	*)
-		export GIT_TEST_HTTPD=true
-		;;
-	esac
+	export GIT_TEST_HTTPD=true
 
 	# The Linux build installs the defined dependency versions below.
 	# The OS X build installs much more recent versions, whichever

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 09/10] ci: use latest Ubuntu release
  2025-01-06 11:16 ` [PATCH v2 " Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2025-01-06 11:16   ` [PATCH v2 08/10] ci: stop special-casing for Ubuntu 16.04 Patrick Steinhardt
@ 2025-01-06 11:16   ` Patrick Steinhardt
  2025-01-06 11:16   ` [PATCH v2 10/10] ci: remove stale code for Azure Pipelines Patrick Steinhardt
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 11:16 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

Both GitHub Actions and GitLab CI use the "ubuntu:latest" tag as the
default image for most jobs. This tag is somewhat misleading though, as
it does not refer to the latest release of Ubuntu, but to the latest LTS
release thereof. But as we already have a couple of jobs exercising the
oldest LTS release of Ubuntu that Git still supports, it would make more
sense to test the oldest and youngest versions of Ubuntu.

Adapt these jobs to instead use the "ubuntu:rolling" tag, which refers
to the actual latest release, which currently is Ubuntu 24.10.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 14 +++++++-------
 .gitlab-ci.yml             | 14 +++++++-------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index b54da639a650682495994e3c7b137eab4e6cb3bf..b90381ae015edf9db5aa4b8c0ace9bb5c549c37b 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -314,10 +314,10 @@ jobs:
       matrix:
         vector:
         - jobname: linux-sha256
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: clang
         - jobname: linux-reftable
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: clang
         - jobname: linux-gcc
           image: ubuntu:20.04
@@ -328,19 +328,19 @@ jobs:
           cc: gcc
           cc_package: gcc-8
         - jobname: linux-gcc-default
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-leaks
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-reftable-leaks
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-asan-ubsan
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: clang
         - jobname: linux-meson
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-musl
           image: alpine:latest
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 29e9056dd5010f8843e42aeae8410973c825de54..8ed3ff5f0373d70b6f609dc5292dda2dd7fd8f88 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -37,10 +37,10 @@ test:linux:
   parallel:
     matrix:
       - jobname: linux-sha256
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: clang
       - jobname: linux-reftable
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: clang
       - jobname: linux-gcc
         image: ubuntu:20.04
@@ -51,16 +51,16 @@ test:linux:
         CC: gcc
         CC_PACKAGE: gcc-8
       - jobname: linux-gcc-default
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
       - jobname: linux-leaks
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
       - jobname: linux-reftable-leaks
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
       - jobname: linux-asan-ubsan
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: clang
       - jobname: pedantic
         image: fedora:latest
@@ -69,7 +69,7 @@ test:linux:
       - jobname: linux32
         image: i386/ubuntu:20.04
       - jobname: linux-meson
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
   artifacts:
     paths:

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 10/10] ci: remove stale code for Azure Pipelines
  2025-01-06 11:16 ` [PATCH v2 " Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2025-01-06 11:16   ` [PATCH v2 09/10] ci: use latest Ubuntu release Patrick Steinhardt
@ 2025-01-06 11:16   ` Patrick Steinhardt
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 11:16 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

Support for Azure Pipelines has been retired in 6081d3898f (ci: retire
the Azure Pipelines definition, 2020-04-11) in favor of GitHub Actions.
Our CI library still has some infrastructure left for Azure though that
is now unused. Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ci/lib.sh                 | 21 +--------------------
 ci/print-test-failures.sh |  5 -----
 2 files changed, 1 insertion(+), 25 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 77a4aabdb8fb416c1733f02d02145b6bc0849998..4003354f16c048b969c0bb4340d2ee2777767300 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -206,26 +206,7 @@ export TERM=${TERM:-dumb}
 # Clear MAKEFLAGS that may come from the outside world.
 export MAKEFLAGS=
 
-if test -n "$SYSTEM_COLLECTIONURI" || test -n "$SYSTEM_TASKDEFINITIONSURI"
-then
-	CI_TYPE=azure-pipelines
-	# We are running in Azure Pipelines
-	CI_BRANCH="$BUILD_SOURCEBRANCH"
-	CI_COMMIT="$BUILD_SOURCEVERSION"
-	CI_JOB_ID="$BUILD_BUILDID"
-	CI_JOB_NUMBER="$BUILD_BUILDNUMBER"
-	CI_OS_NAME="$(echo "$AGENT_OS" | tr A-Z a-z)"
-	test darwin != "$CI_OS_NAME" || CI_OS_NAME=osx
-	CI_REPO_SLUG="$(expr "$BUILD_REPOSITORY_URI" : '.*/\([^/]*/[^/]*\)$')"
-	CC="${CC:-gcc}"
-
-	# use a subdirectory of the cache dir (because the file share is shared
-	# among *all* phases)
-	cache_dir="$HOME/test-cache/$SYSTEM_PHASENAME"
-
-	GIT_TEST_OPTS="--write-junit-xml"
-	JOBS=10
-elif test true = "$GITHUB_ACTIONS"
+if test true = "$GITHUB_ACTIONS"
 then
 	CI_TYPE=github-actions
 	CI_BRANCH="$GITHUB_REF"
diff --git a/ci/print-test-failures.sh b/ci/print-test-failures.sh
index 655687dd827e5b3e4d4879803b0d4499e7751380..dc910e51609cd7344b1ad03fdb4e820e47ad3a88 100755
--- a/ci/print-test-failures.sh
+++ b/ci/print-test-failures.sh
@@ -39,11 +39,6 @@ do
 		test_name="${test_name##*/}"
 		trash_dir="trash directory.$test_name"
 		case "$CI_TYPE" in
-		azure-pipelines)
-			mkdir -p failed-test-artifacts
-			mv "$trash_dir" failed-test-artifacts
-			continue
-			;;
 		github-actions)
 			mkdir -p failed-test-artifacts
 			echo "FAILED_TEST_ARTIFACTS=${TEST_OUTPUT_DIRECTORY:t}/failed-test-artifacts" >>$GITHUB_ENV

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-06 11:12     ` Patrick Steinhardt
@ 2025-01-07  2:39       ` Jeff King
  2025-01-07  8:47         ` Patrick Steinhardt
  2025-01-07  2:48       ` Jeff King
  1 sibling, 1 reply; 69+ messages in thread
From: Jeff King @ 2025-01-07  2:39 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On Mon, Jan 06, 2025 at 12:12:22PM +0100, Patrick Steinhardt wrote:

> > isn't quite right. Even without input buffering on grep's part, it may
> > be too slow to read the data. And adding a sleep as above shows that it
> > still fails with your patch.
> 
> Great. I was hoping to nerd-snipe somebody into helping me out with the
> last sentence in my above paragraph :) Happy to see that you bit.

I think I am a sucker for SIGPIPE races.

> >   - I swapped out "grep" for "head". What we are matching is not
> >     relevant; the important thing is that the reader closes the pipe
> >     immediately. So I guess in that sense we could probably even just
> >     pipe to "true" or similar.
> 
> I think the grep(1) is relevant though. The test explicitly verifies
> that `--recursive` propagates SIGPIPE, so we must make sure that we
> trigger the SIGPIPE when the child process produces output, not when the
> parent process produces it. That's why we grep for "X/S", where "X" is a
> submodule -- it means that we know that it is currently the subprocess
> doing its thing.

Hmm, I see what you mean. I don't think we can do that reliably, though,
or that the perl byte-stuffing is actually helping.

As I wrote it, perl always gets SIGPIPE first (because either "head"
exits while it is writing, or it fills up the pipe buffer and blocks,
waiting for head to exit, and then sees the pipe close).

And thus when we run git-submodule, the pipe is reliably closed and
we'll see SIGPIPE.

But with grep, that does not happen. The grep will run through all of
the data from perl (since it does not contain X/S), and there will not
be anything left in the pipe buffer by the time git-submodule starts. So
all of that data did nothing (though it fools the "sleep 1 && grep" from
losing the race because perl will block until grep starts, after the
sleep is finished).

And so we're left with the same race as before. git-submodule writes the
X/S line, grep reads it and then tries to exit while git-submodule is
writing more. And either:

  a. grep may exit immediately, before git-submodule writes any more
     data. In which case git sees SIGPIPE, which is what we want.

  a. git-submodule may write all of its data before grep exits. It will
     not block, because all of the stuff perl put in the buffer is long
     since gone, having been read by grep already. The data goes into
     the pipe buffer, and git-submodule has no idea it is discarded when
     grep exits. The test fails.

It's hard to simulate this one with a sleep, because it requires either
git-submodule to write quickly, or for grep to be slow after reading the
matching line but before exiting.

For the latter you can do:

diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index 976f91b0eb..e2961e57dc 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -174,7 +174,7 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE'
 		perl -le "print q{foo} for (1..33000)" &&
 		git submodule status --recursive 2>err
 		echo $?>status
-	} | grep -q X/S &&
+	} | { grep -q X/S && sleep 1; } &&
 	test_must_be_empty err &&
 	test_match_signal 13 "$(cat status)"
 '

on top of your patch, which reliably fails the test. I know that looks
kind of ridiculous and fake, but you can imagine it as that first grep
just taking a long time to call exit() and close the pipe.

It's hard to make git-submodule faster, because its output is really
coming from recursive invocations of itself. But you could imagine a
world where we do the submodule recursion in a single process, buffering
it via stdio, and then write all of the lines at once. And then
git-submodule always wins the race (it issues a single write() syscall
and then exits), and the test fails.

To make the test reliable, you'd need to pause or fill the pipe buffer
_after_ writing X/S via git-submodule but before writing the rest of the
data. Or to perhaps convince git-submodule only to write the recursive
data, and then pre-stuff the pipe as I suggested earlier. But I'm not
sure how to do the latter. Even if we ask for:

  git submodule status --recursive -- X

it will print out the status of "X" before recursing into it to show
X/S, etc, which will give us SIGPIPE in the parent submodule process,
not the recursive one.

For the former, I guess you'd need some hook that runs when we recurse
into the submodule and dumps a bunch of garbage into the pipe buffer.
But I don't think there is any such hook that runs here. Unless perhaps
you abused core.fsmonitor or something, but I don't think that's
portable.

So I don't really see a way to do this robustly.

-Peff

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-06 11:12     ` Patrick Steinhardt
  2025-01-07  2:39       ` Jeff King
@ 2025-01-07  2:48       ` Jeff King
  2025-01-07 16:02         ` Junio C Hamano
  1 sibling, 1 reply; 69+ messages in thread
From: Jeff King @ 2025-01-07  2:48 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On Mon, Jan 06, 2025 at 12:12:22PM +0100, Patrick Steinhardt wrote:

> >   - I tried using test_seq to avoid the inline perl, but it doesn't
> >     work! The problem is that it's implemented as a shell function. So
> >     when it gets SIGPIPE, the whole subshell is killed, and we never
> >     even run git-submodule at all. So it has to be a separate process
> >     (though I guess it could be test_seq in a subshell).
> 
> And that one should also work if we retain the grep. I wonder though
> whether we shouldn't prefer to use Perl regardless as it's likely to be
> faster when generating all that gibberish. Perl is basically a hard
> prerequisite for our tests anyway, so it doesn't really hurt to call it
> here.

I don't think we should pursue this direction any more because we need
to get the SIGPIPE mid-way through the git-submodule command (see the
other message I just sent).

But because it is a basic technique for establishing a reliable SIGPIPE,
and we might end up using it elsewhere, I thought I'd post a slightly
improved version.

The two things I didn't like about what I posted earlier were:

  - the guess at the pipe buffer size. 128k is probably enough in
    practice, but it's not guaranteed.

  - piping to "head" actually made our buffer size guess worse. We know
    that "head" is going to read the first line and then exit. But how
    much more data might it read? It might easily buffer 4k or even 8k,
    leaving the buffer not-quite full.

So I think a simpler and more robust version is just this:

  {
	{ yes || true; } &&
	command_expecting_sigpipe; echo $? >status
  } | true

We'll keep producing data in "yes" until the pipe is closed. So it will
closed before command_expecting_sigpipe even starts, and there is no
race there. And because we're using "true" on the right-hand side of the
pipe, nothing is read at all from the pipe. So there's no guessing about
how much might have been read.

And it works no matter how slow the right-hand side of the pipe is
(e.g., you can add a "sleep 1" there and it still works).

Like I said, this won't help our current situation, but after having
spent a little time on it (before realizing that) I figured it was worth
documenting.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v2 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-06 11:16   ` [PATCH v2 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
@ 2025-01-07  2:49     ` Jeff King
  0 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2025-01-07  2:49 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Junio C Hamano

On Mon, Jan 06, 2025 at 12:16:51PM +0100, Patrick Steinhardt wrote:

> Fix the issue by making the writer fill the pipe buffer before we
> execute git-submodule(1). Ideally, it would be git-submodule(1) itself
> that does produce all that data, but it would require us to create a
> large amount of submodules, which is inefficient. Instead, we use Perl
> to print gibberish until the buffer is filled.
> 
> To verify that this works as expected one can apply the following patch
> to the preimage of this commit, which used to reliably trigger the race:
> 
>     diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
>     index f21e920367..9338c75626 100755
>     --- a/t/t7422-submodule-output.sh
>     +++ b/t/t7422-submodule-output.sh
>     @@ -168,7 +168,7 @@ done
> 
>      test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
>             { git submodule status --recursive 2>err; echo $?>status; } |
>     -		grep -q X/S &&
>     +		{ sleep 1 && grep -q X/S; } &&
>             test_must_be_empty err &&
>             test_match_signal 13 "$(cat status)"
>      '
> 
> With the pipe-stuffing workaround the test runs successfully.

Sadly this isn't enough. The pipe-stuffing solves the race with grep
_starting_ (and thus the extra "sleep"), but the fundamental race we've
seen in practice still remains. See my reply the v1 thread for details.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-07  2:39       ` Jeff King
@ 2025-01-07  8:47         ` Patrick Steinhardt
  2025-01-07  8:50           ` Patrick Steinhardt
  2025-01-09  7:17           ` Jeff King
  0 siblings, 2 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-07  8:47 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Phillip Wood

On Mon, Jan 06, 2025 at 09:39:04PM -0500, Jeff King wrote:
> So I don't really see a way to do this robustly.

I think I found a way, which goes back to the inital idea of just
generating heaps of submodules. My current version generates a submodule
"A" with a couple of recursive submodules followed by 2.5k additional
submodules, which overall generates ~150kB of data. This can be done
somewhat efficiently via git-hash-object-object(1) and git-mktree(1),
and things work with a sleep before and after the call to grep(1).

I'm a bit torn though. The required setup is quite complex, and I wonder
whether it is really worth it just to test this edge case. On the other
hand it is there to cover a recent fix in 082caf527e (submodule status:
propagate SIGPIPE, 2024-09-20), so losing the test coverage isn't all
that great, either. And keeping the race is not an option to me, either.

So I'm inclined to go with the below version. WDYT?

Patrick


diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index f21e920367..fbfc60936c 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -167,10 +167,38 @@ do
 done
 
 test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
-	{ git submodule status --recursive 2>err; echo $?>status; } |
-		grep -q X/S &&
-	test_must_be_empty err &&
-	test_match_signal 13 "$(cat status)"
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git clone . subrepo &&
+
+		COMMIT=$(git rev-parse HEAD) &&
+		for i in $(test_seq 2500)
+		do
+			printf "[submodule \"sm-$i\"]\npath = submodule-path-$i\n" "$i" ||
+			return 1
+		done >gitmodules &&
+		BLOB=$(git hash-object -w --stdin <gitmodules) &&
+
+		printf "100644 blob $BLOB\t.gitmodules\n" >tree &&
+		for i in $(test_seq 2500)
+		do
+			printf "160000 commit $COMMIT\tsubmodule-path-%d\n" "$i" ||
+			return 1
+		done >>tree &&
+		TREE=$(git mktree <tree) &&
+
+		COMMIT=$(git commit-tree "$TREE") &&
+		git reset --hard "$COMMIT" &&
+		GIT_ALLOW_PROTOCOL=file git submodule add "$(pwd)"/../X A &&
+
+		{ git submodule status --recursive 2>err; echo $?>status; } |
+			{ sleep 1 && grep -q A/S && sleep 1; } &&
+		test_must_be_empty err &&
+		test_match_signal 13 "$(cat status)"
+	)
 '
 
 test_done


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-07  8:47         ` Patrick Steinhardt
@ 2025-01-07  8:50           ` Patrick Steinhardt
  2025-01-09  7:17           ` Jeff King
  1 sibling, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-07  8:50 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Phillip Wood

On Tue, Jan 07, 2025 at 09:47:44AM +0100, Patrick Steinhardt wrote:
> On Mon, Jan 06, 2025 at 09:39:04PM -0500, Jeff King wrote:
> > So I don't really see a way to do this robustly.
> 
> I think I found a way, which goes back to the inital idea of just
> generating heaps of submodules. My current version generates a submodule
> "A" with a couple of recursive submodules followed by 2.5k additional
> submodules, which overall generates ~150kB of data. This can be done
> somewhat efficiently via git-hash-object-object(1) and git-mktree(1),
> and things work with a sleep before and after the call to grep(1).
> 
> I'm a bit torn though. The required setup is quite complex, and I wonder
> whether it is really worth it just to test this edge case. On the other
> hand it is there to cover a recent fix in 082caf527e (submodule status:
> propagate SIGPIPE, 2024-09-20), so losing the test coverage isn't all
> that great, either. And keeping the race is not an option to me, either.
> 
> So I'm inclined to go with the below version. WDYT?

Gah, this of course needs to be adapted so that it is the submodule that
contains 2.5k recursive submodules. But the idea would still work.

Patrick

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v3 00/10] A couple of CI improvements
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
                   ` (11 preceding siblings ...)
  2025-01-06 11:16 ` [PATCH v2 " Patrick Steinhardt
@ 2025-01-07 12:30 ` Patrick Steinhardt
  2025-01-07 12:30   ` [PATCH v3 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
                     ` (9 more replies)
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
  13 siblings, 10 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-07 12:30 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

Hi,

this patch series addresses a couple of issues I've found while
investigating flaky CI jobs. Besides two more fixes for flaky jobs it
also removes some stale code and simplifies the setup on GitHub Actions
to always use containerized jobs on Linux.

Test runs can be found for GitLab [1] and GitHub [2].

Changes in v2:

  - Expand a bit on the reasoning behind the conversion to use
    containerized jobs.
  - Fix commit message typo.
  - Properly fix the race in t7422 via pipe stuffing, as proposed by
    Peff.
  - Link to v1: https://lore.kernel.org/r/20250103-b4-pks-ci-fixes-v1-0-a9bb95dff833@pks.im

Changes in v3:

  - Another iteration on the SIGPIPE test, which should now finally plug
    the race.
  - Link to v2: https://lore.kernel.org/r/20250106-b4-pks-ci-fixes-v2-0-06ae540771b7@pks.im

Thanks!

Patrick

[1]: https://gitlab.com/gitlab-org/git/-/merge_requests/277
[2]: https://github.com/git/git/pull/1865

---
Patrick Steinhardt (10):
      t0060: fix EBUSY in MinGW when setting up runtime prefix
      t7422: fix flaky test caused by buffered stdout
      github: adapt containerized jobs to be rootless
      github: convert all Linux jobs to be containerized
      github: simplify computation of the job's distro
      gitlab-ci: remove the "linux-old" job
      gitlab-ci: add linux32 job testing against i386
      ci: stop special-casing for Ubuntu 16.04
      ci: use latest Ubuntu release
      ci: remove stale code for Azure Pipelines

 .github/workflows/main.yml  | 78 ++++++++++++++++++++++-----------------------
 .gitlab-ci.yml              | 19 ++++++-----
 ci/install-dependencies.sh  |  6 ++--
 ci/lib.sh                   | 34 +++-----------------
 ci/print-test-failures.sh   |  5 ---
 t/t0060-path-utils.sh       | 10 +++---
 t/t7422-submodule-output.sh | 43 ++++++++++++++++++++++---
 7 files changed, 100 insertions(+), 95 deletions(-)

Range-diff versus v2:

 1:  924cc137a7 =  1:  2d20c22e1c t0060: fix EBUSY in MinGW when setting up runtime prefix
 2:  85f732d57f <  -:  ---------- t7422: fix flaky test caused by buffered stdout
 -:  ---------- >  2:  97e94a22d0 t7422: fix flaky test caused by buffered stdout
 3:  96ffed8ad9 =  3:  5f105f2d04 github: adapt containerized jobs to be rootless
 4:  a276e87563 =  4:  ffcb18fe34 github: convert all Linux jobs to be containerized
 5:  736d660c31 =  5:  e7a9dc276c github: simplify computation of the job's distro
 6:  0c2e227cb8 =  6:  7e1f6b651a gitlab-ci: remove the "linux-old" job
 7:  60c3cb0d76 =  7:  03b4a82fc0 gitlab-ci: add linux32 job testing against i386
 8:  216c043aac =  8:  df57d16eb9 ci: stop special-casing for Ubuntu 16.04
 9:  234b741805 =  9:  7c63294ace ci: use latest Ubuntu release
10:  421852878a = 10:  d47387d596 ci: remove stale code for Azure Pipelines

---
base-commit: 1b4e9a5f8b5f048972c21fe8acafe0404096f694
change-id: 20250103-b4-pks-ci-fixes-2d0a23fb5c78


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v3 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix
  2025-01-07 12:30 ` [PATCH v3 00/10] A couple of CI improvements Patrick Steinhardt
@ 2025-01-07 12:30   ` Patrick Steinhardt
  2025-01-07 12:30   ` [PATCH v3 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-07 12:30 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

Two of our tests in t0060 verify that the runtime prefix functionality
works as expected by creating a separate directory hierarchy, copying
the Git executable in there and then creating scripts relative to that
executable.

These tests fail quite regularly in GitLab CI with the following error:

    expecting success of 0060.218 '%(prefix)/ works':
            mkdir -p pretend/bin &&
            cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
            git config yes.path "%(prefix)/yes" &&
            GIT_EXEC_PATH= ./pretend/bin/git config --path yes.path >actual &&
            echo "$(pwd)/pretend/yes" >expect &&
            test_cmp expect actual
    ++ mkdir -p pretend/bin
    ++ cp /c/GitLab-Runner/builds/gitlab-org/git/git.exe pretend/bin/
    cp: cannot create regular file 'pretend/bin/git.exe': Device or resource busy
    error: last command exited with $?=1
    not ok 218 - %(prefix)/ works

Seemingly, the "git.exe" binary we are trying to overwrite is still
being held open. It is somewhat puzzling why exactly that is: while the
preceding test _does_ write to and execute the same path, it should have
exited and shouldn't keep any backgrounded processes around. So it must
be held open by something else, either in MinGW or in Windows itself.

While the root cause is puzzling, the workaround is trivial enough:
instead of writing the file twice we simply pull the common setup into a
separate test case so that we won't observe EBUSY in the first place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t0060-path-utils.sh | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index dbb2e73bcd912ae6a804603ff54e4c609966fa5d..8545cdfab559b4e247cb2699965e637529fd930a 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -592,17 +592,19 @@ test_lazy_prereq CAN_EXEC_IN_PWD '
 	./git rev-parse
 '
 
+test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'setup runtime prefix' '
+	mkdir -p pretend/bin &&
+	cp "$GIT_EXEC_PATH"/git$X pretend/bin/
+'
+
 test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX works' '
-	mkdir -p pretend/bin pretend/libexec/git-core &&
+	mkdir -p pretend/libexec/git-core &&
 	echo "echo HERE" | write_script pretend/libexec/git-core/git-here &&
-	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
 	GIT_EXEC_PATH= ./pretend/bin/git here >actual &&
 	echo HERE >expect &&
 	test_cmp expect actual'
 
 test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' '
-	mkdir -p pretend/bin &&
-	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
 	git config yes.path "%(prefix)/yes" &&
 	GIT_EXEC_PATH= ./pretend/bin/git config --path yes.path >actual &&
 	echo "$(pwd)/pretend/yes" >expect &&

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v3 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-07 12:30 ` [PATCH v3 00/10] A couple of CI improvements Patrick Steinhardt
  2025-01-07 12:30   ` [PATCH v3 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
@ 2025-01-07 12:30   ` Patrick Steinhardt
  2025-01-09  7:33     ` Jeff King
  2025-01-07 12:30   ` [PATCH v3 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
                     ` (7 subsequent siblings)
  9 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-07 12:30 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

One test in t7422 asserts that `git submodule status --recursive`
properly handles SIGPIPE. This test is flaky though and may sometimes
not see a SIGPIPE at all:

    expecting success of 7422.18 'git submodule status --recursive propagates SIGPIPE':
            { git submodule status --recursive 2>err; echo $?>status; } |
                    grep -q X/S &&
            test_must_be_empty err &&
            test_match_signal 13 "$(cat status)"
    ++ git submodule status --recursive
    ++ grep -q X/S
    ++ echo 0
    ++ test_must_be_empty err
    ++ test 1 -ne 1
    ++ test_path_is_file err
    ++ test 1 -ne 1
    ++ test -f err
    ++ test -s err
    +++ cat status
    ++ test_match_signal 13 0
    ++ test 0 = 141
    ++ test 0 = 269
    ++ return 1
    error: last command exited with $?=1
    not ok 18 - git submodule status --recursive propagates SIGPIPE

The issue is caused by us using grep(1) to terminate the pipe on the
first matching line in the recursing git-submodule(1) process. Standard
streams are typically buffered though, so this condition is racy and may
cause us to terminate the pipe after git-submodule(1) has already
exited, and in that case we wouldn't see the expected signal.

Fix the issue by generating a couple thousand nested submodules and
matching on the first nested submodule. This ensures that the recursive
git-submodule(1) process completely fills its stdout buffer, which makes
subsequent writes block until the downstream consumer of the pipe either
fully drains it or closes it.

To verify that this works as expected one can apply the following patch
to the preimage of this commit, which used to reliably trigger the race:

    diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
    index 3c5177cc30..df6001f8a0 100755
    --- a/t/t7422-submodule-output.sh
    +++ b/t/t7422-submodule-output.sh
    @@ -202,7 +202,7 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE'
     		cd repo &&
     		GIT_ALLOW_PROTOCOL=file git submodule add "$(pwd)"/../submodule &&
     		{ git submodule status --recursive 2>err; echo $?>status; } |
    -			grep -q recursive-submodule-path-1 &&
    +			{ sleep 1 && grep -q recursive-submodule-path-1 && sleep 1; } &&
     		test_must_be_empty err &&
     		test_match_signal 13 "$(cat status)"
     	)

With the pipe-stuffing workaround the test runs successfully.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t7422-submodule-output.sh | 43 +++++++++++++++++++++++++++++++++++++++----
 1 file changed, 39 insertions(+), 4 deletions(-)

diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index f21e9203678b94701281d5339ae8bfe53d5de0ed..023a5cbdc44bac2389fca45cf7017750627c4ce9 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -167,10 +167,45 @@ do
 done
 
 test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
-	{ git submodule status --recursive 2>err; echo $?>status; } |
-		grep -q X/S &&
-	test_must_be_empty err &&
-	test_match_signal 13 "$(cat status)"
+	# The test setup is somewhat involved because triggering a SIGPIPE is
+	# racy with buffered pipes. To avoid the raciness we thus need to make
+	# sure that the subprocess in question fills the buffers completely,
+	# which requires a couple thousand submodules in total.
+	test_when_finished "rm -rf submodule repo" &&
+	git init submodule &&
+	(
+		cd submodule &&
+		test_commit initial &&
+
+		COMMIT=$(git rev-parse HEAD) &&
+		for i in $(test_seq 2000)
+		do
+			printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i" ||
+			return 1
+		done >gitmodules &&
+		BLOB=$(git hash-object -w --stdin <gitmodules) &&
+
+		printf "100644 blob $BLOB\t.gitmodules\n" >tree &&
+		for i in $(test_seq 2000)
+		do
+			printf "160000 commit $COMMIT\trecursive-submodule-path-%d\n" "$i" ||
+			return 1
+		done >>tree &&
+		TREE=$(git mktree <tree) &&
+
+		COMMIT=$(git commit-tree "$TREE") &&
+		git reset --hard "$COMMIT"
+	) &&
+
+	git init repo &&
+	(
+		cd repo &&
+		GIT_ALLOW_PROTOCOL=file git submodule add "$(pwd)"/../submodule &&
+		{ git submodule status --recursive 2>err; echo $?>status; } |
+			grep -q recursive-submodule-path-1 &&
+		test_must_be_empty err &&
+		test_match_signal 13 "$(cat status)"
+	)
 '
 
 test_done

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v3 03/10] github: adapt containerized jobs to be rootless
  2025-01-07 12:30 ` [PATCH v3 00/10] A couple of CI improvements Patrick Steinhardt
  2025-01-07 12:30   ` [PATCH v3 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
  2025-01-07 12:30   ` [PATCH v3 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
@ 2025-01-07 12:30   ` Patrick Steinhardt
  2025-01-07 12:30   ` [PATCH v3 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-07 12:30 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

The containerized jobs in GitHub Actions run as root, giving them
special permissions to for example delete files even when the user
shouldn't be able to due to file permissions. This limitation keeps us
from using containerized jobs for most of our Ubuntu-based jobs as it
causes a number of tests to fail.

Adapt the jobs to create a separate user that executes the test suite.
This follows similar infrastructure that we already have in GitLab CI.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 6 ++++--
 ci/install-dependencies.sh | 2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 900be9957a23fcaa64e1aefd0c8638c5f84b7997..b02f5873a540b458d38e7951b4ee3d5ca598ae23 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -371,10 +371,12 @@ jobs:
       run: apt -q update && apt -q -y install libc6-amd64 lib64stdc++6
     - uses: actions/checkout@v4
     - run: ci/install-dependencies.sh
-    - run: ci/run-build-and-tests.sh
+    - run: useradd builder --create-home
+    - run: chown -R builder .
+    - run: sudo --preserve-env --set-home --user=builder ci/run-build-and-tests.sh
     - name: print test failures
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
-      run: ci/print-test-failures.sh
+      run: sudo --preserve-env --set-home --user=builder ci/print-test-failures.sh
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v4
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index d1cb9fa8785388b3674fcea4dd682abc0725c968..ecb5b9d36c20d3e7e96148ac628a96c62642c308 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -31,7 +31,7 @@ alpine-*)
 	;;
 fedora-*|almalinux-*)
 	dnf -yq update >/dev/null &&
-	dnf -yq install make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
+	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
 	;;
 ubuntu-*|ubuntu32-*|debian-*)
 	# Required so that apt doesn't wait for user input on certain packages.

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v3 04/10] github: convert all Linux jobs to be containerized
  2025-01-07 12:30 ` [PATCH v3 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2025-01-07 12:30   ` [PATCH v3 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
@ 2025-01-07 12:30   ` Patrick Steinhardt
  2025-01-07 12:30   ` [PATCH v3 05/10] github: simplify computation of the job's distro Patrick Steinhardt
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-07 12:30 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

We have split the CI jobs in GitHub Workflows into two categories:

  - Those running on a machine pool directly.

  - Those running in a container on the machine pool.

The latter is more flexible because it allows us to freely pick whatever
container image we want to use for a specific job, while the former only
allows us to pick from a handful of different distros. The containerized
jobs do not have any significant downsides to the best of my knowledge:

  - They aren't significantly slower to start up. A quick comparison by
    Peff shows that the difference is mostly lost in the noise:

            job             |  old | new
        --------------------|------|------
        linux-TEST-vars      11m30s 10m54s
        linux-asan-ubsan     30m26s 31m14s
        linux-gcc             9m47s 10m6s
        linux-gcc-default     9m47s  9m41s
        linux-leaks          25m50s 25m21s
        linux-meson          10m36s 10m41s
        linux-reftable       10m25s 10m23s
        linux-reftable-leaks 27m18s 27m28s
        linux-sha256          9m54s 10m31s

    Some jobs are a bit faster, some are a bit slower, but there does
    not seem to be any significant change.

  - Containerized jobs run as root, which keeps a couple of tests from
    running. This has been addressed in the preceding commit though,
    where we now use setpriv(1) to run tests as a separate user.

  - GitHub injects a Node binary into containerized jobs, which is
    dynamically linked. This has led to some issues in the past [1], but
    only for our 32 bit jobs. The issues have since been resolved.

Overall there seem to be no downsides, but the upside is that we have
more control over the exact image that these jobs use. Convert the Linux
jobs accordingly.

[1]: https://lore.kernel.org/git/20240912094841.GD589828@coredump.intra.peff.net/

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 68 ++++++++++++++++++++++++++--------------------
 1 file changed, 39 insertions(+), 29 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index b02f5873a540b458d38e7951b4ee3d5ca598ae23..8e5847da4fab009ad699c18e1a5a336a8b45c3ed 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -259,20 +259,6 @@ jobs:
       fail-fast: false
       matrix:
         vector:
-          - jobname: linux-sha256
-            cc: clang
-            pool: ubuntu-latest
-          - jobname: linux-reftable
-            cc: clang
-            pool: ubuntu-latest
-          - jobname: linux-gcc
-            cc: gcc
-            cc_package: gcc-8
-            pool: ubuntu-20.04
-          - jobname: linux-TEST-vars
-            cc: gcc
-            cc_package: gcc-8
-            pool: ubuntu-20.04
           - jobname: osx-clang
             cc: clang
             pool: macos-13
@@ -285,21 +271,6 @@ jobs:
           - jobname: osx-meson
             cc: clang
             pool: macos-13
-          - jobname: linux-gcc-default
-            cc: gcc
-            pool: ubuntu-latest
-          - jobname: linux-leaks
-            cc: gcc
-            pool: ubuntu-latest
-          - jobname: linux-reftable-leaks
-            cc: gcc
-            pool: ubuntu-latest
-          - jobname: linux-asan-ubsan
-            cc: clang
-            pool: ubuntu-latest
-          - jobname: linux-meson
-            cc: gcc
-            pool: ubuntu-latest
     env:
       CC: ${{matrix.vector.cc}}
       CC_PACKAGE: ${{matrix.vector.cc_package}}
@@ -342,6 +313,44 @@ jobs:
       fail-fast: false
       matrix:
         vector:
+        - jobname: linux-sha256
+          image: ubuntu:latest
+          cc: clang
+          distro: ubuntu-latest
+        - jobname: linux-reftable
+          image: ubuntu:latest
+          cc: clang
+          distro: ubuntu-latest
+        - jobname: linux-gcc
+          image: ubuntu:20.04
+          cc: gcc
+          cc_package: gcc-8
+          distro: ubuntu-20.04
+        - jobname: linux-TEST-vars
+          image: ubuntu:20.04
+          cc: gcc
+          cc_package: gcc-8
+          distro: ubuntu-20.04
+        - jobname: linux-gcc-default
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
+        - jobname: linux-leaks
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
+        - jobname: linux-reftable-leaks
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
+        - jobname: linux-asan-ubsan
+          image: ubuntu:latest
+          cc: clang
+          distro: ubuntu-latest
+        - jobname: linux-meson
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
         - jobname: linux-musl
           image: alpine
           distro: alpine-latest
@@ -363,6 +372,7 @@ jobs:
     env:
       jobname: ${{matrix.vector.jobname}}
       distro: ${{matrix.vector.distro}}
+      CC: ${{matrix.vector.cc}}
     runs-on: ubuntu-latest
     container: ${{matrix.vector.image}}
     steps:

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v3 05/10] github: simplify computation of the job's distro
  2025-01-07 12:30 ` [PATCH v3 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2025-01-07 12:30   ` [PATCH v3 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
@ 2025-01-07 12:30   ` Patrick Steinhardt
  2025-01-07 12:30   ` [PATCH v3 06/10] gitlab-ci: remove the "linux-old" job Patrick Steinhardt
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-07 12:30 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

We explicitly list the distro of Linux-based jobs, but it is equivalent
to the name of the image in almost all cases, except that colons are
replaced with dashes. Drop the redundant information and massage it in
our CI scripts, which is equivalent to how we do it in GitLab CI.

There are a couple of exceptions:

  - The "linux32" job, whose distro name is different than the image
    name. This is handled by adapting all sites to use the new name.

  - The "alpine" and "fedora" jobs, neither of which specify a tag for
    their image. This is handled by adding the "latest" tag.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 22 ++++------------------
 ci/install-dependencies.sh |  4 ++--
 ci/lib.sh                  |  2 ++
 3 files changed, 8 insertions(+), 20 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 8e5847da4fab009ad699c18e1a5a336a8b45c3ed..b54da639a650682495994e3c7b137eab4e6cb3bf 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -275,7 +275,7 @@ jobs:
       CC: ${{matrix.vector.cc}}
       CC_PACKAGE: ${{matrix.vector.cc_package}}
       jobname: ${{matrix.vector.jobname}}
-      distro: ${{matrix.vector.pool}}
+      CI_JOB_IMAGE: ${{matrix.vector.pool}}
       TEST_OUTPUT_DIRECTORY: ${{github.workspace}}/t
     runs-on: ${{matrix.vector.pool}}
     steps:
@@ -316,63 +316,49 @@ jobs:
         - jobname: linux-sha256
           image: ubuntu:latest
           cc: clang
-          distro: ubuntu-latest
         - jobname: linux-reftable
           image: ubuntu:latest
           cc: clang
-          distro: ubuntu-latest
         - jobname: linux-gcc
           image: ubuntu:20.04
           cc: gcc
           cc_package: gcc-8
-          distro: ubuntu-20.04
         - jobname: linux-TEST-vars
           image: ubuntu:20.04
           cc: gcc
           cc_package: gcc-8
-          distro: ubuntu-20.04
         - jobname: linux-gcc-default
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-leaks
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-reftable-leaks
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-asan-ubsan
           image: ubuntu:latest
           cc: clang
-          distro: ubuntu-latest
         - jobname: linux-meson
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-musl
-          image: alpine
-          distro: alpine-latest
+          image: alpine:latest
         # Supported until 2025-04-02.
         - jobname: linux32
           image: i386/ubuntu:focal
-          distro: ubuntu32-20.04
         - jobname: pedantic
-          image: fedora
-          distro: fedora-latest
+          image: fedora:latest
         # A RHEL 8 compatible distro.  Supported until 2029-05-31.
         - jobname: almalinux-8
           image: almalinux:8
-          distro: almalinux-8
         # Supported until 2026-08-31.
         - jobname: debian-11
           image: debian:11
-          distro: debian-11
     env:
       jobname: ${{matrix.vector.jobname}}
-      distro: ${{matrix.vector.distro}}
       CC: ${{matrix.vector.cc}}
+      CI_JOB_IMAGE: ${{matrix.vector.image}}
     runs-on: ubuntu-latest
     container: ${{matrix.vector.image}}
     steps:
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index ecb5b9d36c20d3e7e96148ac628a96c62642c308..d5a959e25ff3236656ff3416b81732ec5c2107c1 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -33,7 +33,7 @@ fedora-*|almalinux-*)
 	dnf -yq update >/dev/null &&
 	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
 	;;
-ubuntu-*|ubuntu32-*|debian-*)
+ubuntu-*|i386/ubuntu-*|debian-*)
 	# Required so that apt doesn't wait for user input on certain packages.
 	export DEBIAN_FRONTEND=noninteractive
 
@@ -42,7 +42,7 @@ ubuntu-*|ubuntu32-*|debian-*)
 		SVN='libsvn-perl subversion'
 		LANGUAGES='language-pack-is'
 		;;
-	ubuntu32-*)
+	i386/ubuntu-*)
 		SVN=
 		LANGUAGES='language-pack-is'
 		;;
diff --git a/ci/lib.sh b/ci/lib.sh
index 8885ee3c3f86c62e8783d27756b8779bd491e7e6..f8b68ab8a6546802756fd516ca15a2c97223da5f 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -246,6 +246,8 @@ then
 
 	GIT_TEST_OPTS="--github-workflow-markup"
 	JOBS=10
+
+	distro=$(echo "$CI_JOB_IMAGE" | tr : -)
 elif test true = "$GITLAB_CI"
 then
 	CI_TYPE=gitlab-ci

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v3 06/10] gitlab-ci: remove the "linux-old" job
  2025-01-07 12:30 ` [PATCH v3 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2025-01-07 12:30   ` [PATCH v3 05/10] github: simplify computation of the job's distro Patrick Steinhardt
@ 2025-01-07 12:30   ` Patrick Steinhardt
  2025-01-07 12:30   ` [PATCH v3 07/10] gitlab-ci: add linux32 job testing against i386 Patrick Steinhardt
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-07 12:30 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

The "linux-old" job was historically testing against the oldest
supported LTS release of Ubuntu. But with c85bcb5de1 (gitlab-ci: switch
from Ubuntu 16.04 to 20.04, 2024-10-31) it has been converted to test
against Ubuntu 20.04, which already gets exercised in a couple of other
CI jobs. It's thus not adding any significant test coverage.

Drop the job.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 9254e01583306e67dc12b6b9e0015183e1108655..00bc727865031620752771af4a9030c7de1b73df 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -36,9 +36,6 @@ test:linux:
       fi
   parallel:
     matrix:
-      - jobname: linux-old
-        image: ubuntu:20.04
-        CC: gcc
       - jobname: linux-sha256
         image: ubuntu:latest
         CC: clang

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v3 07/10] gitlab-ci: add linux32 job testing against i386
  2025-01-07 12:30 ` [PATCH v3 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2025-01-07 12:30   ` [PATCH v3 06/10] gitlab-ci: remove the "linux-old" job Patrick Steinhardt
@ 2025-01-07 12:30   ` Patrick Steinhardt
  2025-01-07 12:30   ` [PATCH v3 08/10] ci: stop special-casing for Ubuntu 16.04 Patrick Steinhardt
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-07 12:30 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

Add another job to GitLab CI that tests against the i386 architecture.
This job is equivalent to the same job in GitHub Workflows.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 2 ++
 ci/lib.sh      | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 00bc727865031620752771af4a9030c7de1b73df..29e9056dd5010f8843e42aeae8410973c825de54 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -66,6 +66,8 @@ test:linux:
         image: fedora:latest
       - jobname: linux-musl
         image: alpine:latest
+      - jobname: linux32
+        image: i386/ubuntu:20.04
       - jobname: linux-meson
         image: ubuntu:latest
         CC: gcc
diff --git a/ci/lib.sh b/ci/lib.sh
index f8b68ab8a6546802756fd516ca15a2c97223da5f..2293849ada3b45873f80e4392ab93c65657d0f13 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -269,7 +269,7 @@ then
 		CI_OS_NAME=osx
 		JOBS=$(nproc)
 		;;
-	*,alpine:*|*,fedora:*|*,ubuntu:*)
+	*,alpine:*|*,fedora:*|*,ubuntu:*|*,i386/ubuntu:*)
 		CI_OS_NAME=linux
 		JOBS=$(nproc)
 		;;

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v3 08/10] ci: stop special-casing for Ubuntu 16.04
  2025-01-07 12:30 ` [PATCH v3 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2025-01-07 12:30   ` [PATCH v3 07/10] gitlab-ci: add linux32 job testing against i386 Patrick Steinhardt
@ 2025-01-07 12:30   ` Patrick Steinhardt
  2025-01-07 12:30   ` [PATCH v3 09/10] ci: use latest Ubuntu release Patrick Steinhardt
  2025-01-07 12:30   ` [PATCH v3 10/10] ci: remove stale code for Azure Pipelines Patrick Steinhardt
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-07 12:30 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

With c85bcb5de1 (gitlab-ci: switch from Ubuntu 16.04 to 20.04,
2024-10-31) we have adapted the last CI job to stop using Ubuntu 16.04
in favor of Ubuntu 20.04. Remove the special-casing we still have in our
CI scripts.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ci/lib.sh | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 2293849ada3b45873f80e4392ab93c65657d0f13..77a4aabdb8fb416c1733f02d02145b6bc0849998 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -347,14 +347,7 @@ ubuntu-*)
 	fi
 	MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/$PYTHON_PACKAGE"
 
-	case "$distro" in
-	ubuntu-16.04)
-		# Apache is too old for HTTP/2.
-		;;
-	*)
-		export GIT_TEST_HTTPD=true
-		;;
-	esac
+	export GIT_TEST_HTTPD=true
 
 	# The Linux build installs the defined dependency versions below.
 	# The OS X build installs much more recent versions, whichever

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v3 09/10] ci: use latest Ubuntu release
  2025-01-07 12:30 ` [PATCH v3 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2025-01-07 12:30   ` [PATCH v3 08/10] ci: stop special-casing for Ubuntu 16.04 Patrick Steinhardt
@ 2025-01-07 12:30   ` Patrick Steinhardt
  2025-01-07 12:30   ` [PATCH v3 10/10] ci: remove stale code for Azure Pipelines Patrick Steinhardt
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-07 12:30 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

Both GitHub Actions and GitLab CI use the "ubuntu:latest" tag as the
default image for most jobs. This tag is somewhat misleading though, as
it does not refer to the latest release of Ubuntu, but to the latest LTS
release thereof. But as we already have a couple of jobs exercising the
oldest LTS release of Ubuntu that Git still supports, it would make more
sense to test the oldest and youngest versions of Ubuntu.

Adapt these jobs to instead use the "ubuntu:rolling" tag, which refers
to the actual latest release, which currently is Ubuntu 24.10.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 14 +++++++-------
 .gitlab-ci.yml             | 14 +++++++-------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index b54da639a650682495994e3c7b137eab4e6cb3bf..b90381ae015edf9db5aa4b8c0ace9bb5c549c37b 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -314,10 +314,10 @@ jobs:
       matrix:
         vector:
         - jobname: linux-sha256
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: clang
         - jobname: linux-reftable
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: clang
         - jobname: linux-gcc
           image: ubuntu:20.04
@@ -328,19 +328,19 @@ jobs:
           cc: gcc
           cc_package: gcc-8
         - jobname: linux-gcc-default
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-leaks
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-reftable-leaks
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-asan-ubsan
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: clang
         - jobname: linux-meson
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-musl
           image: alpine:latest
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 29e9056dd5010f8843e42aeae8410973c825de54..8ed3ff5f0373d70b6f609dc5292dda2dd7fd8f88 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -37,10 +37,10 @@ test:linux:
   parallel:
     matrix:
       - jobname: linux-sha256
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: clang
       - jobname: linux-reftable
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: clang
       - jobname: linux-gcc
         image: ubuntu:20.04
@@ -51,16 +51,16 @@ test:linux:
         CC: gcc
         CC_PACKAGE: gcc-8
       - jobname: linux-gcc-default
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
       - jobname: linux-leaks
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
       - jobname: linux-reftable-leaks
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
       - jobname: linux-asan-ubsan
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: clang
       - jobname: pedantic
         image: fedora:latest
@@ -69,7 +69,7 @@ test:linux:
       - jobname: linux32
         image: i386/ubuntu:20.04
       - jobname: linux-meson
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
   artifacts:
     paths:

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v3 10/10] ci: remove stale code for Azure Pipelines
  2025-01-07 12:30 ` [PATCH v3 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2025-01-07 12:30   ` [PATCH v3 09/10] ci: use latest Ubuntu release Patrick Steinhardt
@ 2025-01-07 12:30   ` Patrick Steinhardt
  9 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-07 12:30 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

Support for Azure Pipelines has been retired in 6081d3898f (ci: retire
the Azure Pipelines definition, 2020-04-11) in favor of GitHub Actions.
Our CI library still has some infrastructure left for Azure though that
is now unused. Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ci/lib.sh                 | 21 +--------------------
 ci/print-test-failures.sh |  5 -----
 2 files changed, 1 insertion(+), 25 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 77a4aabdb8fb416c1733f02d02145b6bc0849998..4003354f16c048b969c0bb4340d2ee2777767300 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -206,26 +206,7 @@ export TERM=${TERM:-dumb}
 # Clear MAKEFLAGS that may come from the outside world.
 export MAKEFLAGS=
 
-if test -n "$SYSTEM_COLLECTIONURI" || test -n "$SYSTEM_TASKDEFINITIONSURI"
-then
-	CI_TYPE=azure-pipelines
-	# We are running in Azure Pipelines
-	CI_BRANCH="$BUILD_SOURCEBRANCH"
-	CI_COMMIT="$BUILD_SOURCEVERSION"
-	CI_JOB_ID="$BUILD_BUILDID"
-	CI_JOB_NUMBER="$BUILD_BUILDNUMBER"
-	CI_OS_NAME="$(echo "$AGENT_OS" | tr A-Z a-z)"
-	test darwin != "$CI_OS_NAME" || CI_OS_NAME=osx
-	CI_REPO_SLUG="$(expr "$BUILD_REPOSITORY_URI" : '.*/\([^/]*/[^/]*\)$')"
-	CC="${CC:-gcc}"
-
-	# use a subdirectory of the cache dir (because the file share is shared
-	# among *all* phases)
-	cache_dir="$HOME/test-cache/$SYSTEM_PHASENAME"
-
-	GIT_TEST_OPTS="--write-junit-xml"
-	JOBS=10
-elif test true = "$GITHUB_ACTIONS"
+if test true = "$GITHUB_ACTIONS"
 then
 	CI_TYPE=github-actions
 	CI_BRANCH="$GITHUB_REF"
diff --git a/ci/print-test-failures.sh b/ci/print-test-failures.sh
index 655687dd827e5b3e4d4879803b0d4499e7751380..dc910e51609cd7344b1ad03fdb4e820e47ad3a88 100755
--- a/ci/print-test-failures.sh
+++ b/ci/print-test-failures.sh
@@ -39,11 +39,6 @@ do
 		test_name="${test_name##*/}"
 		trash_dir="trash directory.$test_name"
 		case "$CI_TYPE" in
-		azure-pipelines)
-			mkdir -p failed-test-artifacts
-			mv "$trash_dir" failed-test-artifacts
-			continue
-			;;
 		github-actions)
 			mkdir -p failed-test-artifacts
 			echo "FAILED_TEST_ARTIFACTS=${TEST_OUTPUT_DIRECTORY:t}/failed-test-artifacts" >>$GITHUB_ENV

-- 
2.48.0.rc1.245.gb3e6e7acbc.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-07  2:48       ` Jeff King
@ 2025-01-07 16:02         ` Junio C Hamano
  0 siblings, 0 replies; 69+ messages in thread
From: Junio C Hamano @ 2025-01-07 16:02 UTC (permalink / raw)
  To: Jeff King; +Cc: Patrick Steinhardt, git

Jeff King <peff@peff.net> writes:

> So I think a simpler and more robust version is just this:
>
>   {
> 	{ yes || true; } &&
> 	command_expecting_sigpipe; echo $? >status
>   } | true
>
> We'll keep producing data in "yes" until the pipe is closed. So it will
> closed before command_expecting_sigpipe even starts, and there is no
> race there. And because we're using "true" on the right-hand side of the
> pipe, nothing is read at all from the pipe. So there's no guessing about
> how much might have been read.

;-)

> Like I said, this won't help our current situation, but after having
> spent a little time on it (before realizing that) I figured it was worth
> documenting.

It is always fun to ses these clever hacks on the list.

Thanks.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-07  8:47         ` Patrick Steinhardt
  2025-01-07  8:50           ` Patrick Steinhardt
@ 2025-01-09  7:17           ` Jeff King
  2025-01-09 16:16             ` Junio C Hamano
  1 sibling, 1 reply; 69+ messages in thread
From: Jeff King @ 2025-01-09  7:17 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Phillip Wood

On Tue, Jan 07, 2025 at 09:47:43AM +0100, Patrick Steinhardt wrote:

> On Mon, Jan 06, 2025 at 09:39:04PM -0500, Jeff King wrote:
> > So I don't really see a way to do this robustly.
> 
> I think I found a way, which goes back to the inital idea of just
> generating heaps of submodules. My current version generates a submodule
> "A" with a couple of recursive submodules followed by 2.5k additional
> submodules, which overall generates ~150kB of data. This can be done
> somewhat efficiently via git-hash-object-object(1) and git-mktree(1),
> and things work with a sleep before and after the call to grep(1).

Ah, of course. I was so lost in trying to find hacks that I forgot we
could just actually convince it to send a lot of data. ;)

Your solution looks nice. It's O(1) processes, since all of the heavy
lifting is done by the long gitmodules file and tree.

I was going to suggest that you could reduce the number of submodules by
giving them large paths (or large checked-out branch names) to get more
bytes of output per submodule. But there is not really much point. What
you have should run quite quickly.

> I'm a bit torn though. The required setup is quite complex, and I wonder
> whether it is really worth it just to test this edge case. On the other
> hand it is there to cover a recent fix in 082caf527e (submodule status:
> propagate SIGPIPE, 2024-09-20), so losing the test coverage isn't all
> that great, either. And keeping the race is not an option to me, either.
> 
> So I'm inclined to go with the below version. WDYT?

Yeah, I was tempted after my last email to suggest just ditching the
test, too. :) But I think what you've written here is a good approach.
I'll look carefully over what you sent in the v3 series.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v3 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-07 12:30   ` [PATCH v3 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
@ 2025-01-09  7:33     ` Jeff King
  0 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2025-01-09  7:33 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Junio C Hamano, Phillip Wood

On Tue, Jan 07, 2025 at 01:30:44PM +0100, Patrick Steinhardt wrote:

> The issue is caused by us using grep(1) to terminate the pipe on the
> first matching line in the recursing git-submodule(1) process. Standard
> streams are typically buffered though, so this condition is racy and may
> cause us to terminate the pipe after git-submodule(1) has already
> exited, and in that case we wouldn't see the expected signal.

This patch looks good to me overall, but I think there are a few small
inaccuracies in the commit message.

I don't think buffering is relevant here. Especially since in the
original test there isn't any buffering (the output comes from separate
recursive status processes, so each line gets its own write() call).

The race you're seeing is:

  1. git-submodule (or its child process) writes the first X/S line
     we're trying to match

  2. grep matches the line

  3a. grep exits, closing the pipe

  3b. git-submodule (or its children) writes the rest of its lines.

Steps 3a and 3b happen at the same time without any guarantees. If 3a
happens first, we get SIGPIPE. Otherwise, we don't (and the test fails).

And when git-submodule exits is not important. The critical timing is
when it does its final write(). If the pipe closes after that, even if
it is still running, it will not notice. (I admit that one is a
micro-nit, though).

> Fix the issue by generating a couple thousand nested submodules and
> matching on the first nested submodule. This ensures that the recursive
> git-submodule(1) process completely fills its stdout buffer, which makes
> subsequent writes block until the downstream consumer of the pipe either
> fully drains it or closes it.

One more micro-nit: "fully drains" is not accurate. If grep reads
another 4096 bytes, then that opens up 4096 more bytes in the pipe
buffer. And git-submodule can then write to it. Not important for our
case, since we "know" grep will not read more after matching, but will
just close. So it is really more like "block until the downstream
consumer either reads more or closes it".

That "know" is a little bit of an assumption. In particular, there is no
reason grep could not read into a 1MB buffer, consuming the whole input,
match within it, and only then exit. And then we'd be racy again. In
practice I'm not too worried about this. I'd be surprised by a buffer
bigger than 8k (which is what I saw when I strace'd it on my Linux
system), and your generated input is something like 160k. That should
fill even a generous pipe buffer plus grep input buffer.

> +	git init submodule &&
> +	(
> +		cd submodule &&
> +		test_commit initial &&
> +
> +		COMMIT=$(git rev-parse HEAD) &&
> +		for i in $(test_seq 2000)
> +		do
> +			printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i" ||
> +			return 1
> +		done >gitmodules &&
> +		BLOB=$(git hash-object -w --stdin <gitmodules) &&
> +
> +		printf "100644 blob $BLOB\t.gitmodules\n" >tree &&
> +		for i in $(test_seq 2000)
> +		do
> +			printf "160000 commit $COMMIT\trecursive-submodule-path-%d\n" "$i" ||
> +			return 1
> +		done >>tree &&
> +		TREE=$(git mktree <tree) &&
> +
> +		COMMIT=$(git commit-tree "$TREE") &&
> +		git reset --hard "$COMMIT"
> +	) &&

OK, so the submodule has a huge number of missing children. But that's
enough for us, because the "-" lines generated by "submodule status" are
fine. So you're able to generate the whole thing with only printf
processes running in the loops. Very clever.

> +	git init repo &&
> +	(
> +		cd repo &&
> +		GIT_ALLOW_PROTOCOL=file git submodule add "$(pwd)"/../submodule &&
> +		{ git submodule status --recursive 2>err; echo $?>status; } |
> +			grep -q recursive-submodule-path-1 &&
> +		test_must_be_empty err &&
> +		test_match_signal 13 "$(cat status)"
> +	)

And then adding another repo wrapping it is what makes it recursive.
Nice.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-09  7:17           ` Jeff King
@ 2025-01-09 16:16             ` Junio C Hamano
  0 siblings, 0 replies; 69+ messages in thread
From: Junio C Hamano @ 2025-01-09 16:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Patrick Steinhardt, git, Phillip Wood

Jeff King <peff@peff.net> writes:

> On Tue, Jan 07, 2025 at 09:47:43AM +0100, Patrick Steinhardt wrote:
>
>> On Mon, Jan 06, 2025 at 09:39:04PM -0500, Jeff King wrote:
>> > So I don't really see a way to do this robustly.
>> 
>> I think I found a way, which goes back to the inital idea of just
>> generating heaps of submodules.
>> ...
> Your solution looks nice. It's O(1) processes, since all of the heavy
> lifting is done by the long gitmodules file and tree.
>
> I was going to suggest that you could reduce the number of submodules by
> giving them large paths (or large checked-out branch names) to get more
> bytes of output per submodule. But there is not really much point. What
> you have should run quite quickly.

;-)

>> I'm a bit torn though. The required setup is quite complex, and I wonder
>> whether it is really worth it just to test this edge case. On the other
>> hand it is there to cover a recent fix in 082caf527e (submodule status:
>> propagate SIGPIPE, 2024-09-20), so losing the test coverage isn't all
>> that great, either. And keeping the race is not an option to me, either.
>> 
>> So I'm inclined to go with the below version. WDYT?
>
> Yeah, I was tempted after my last email to suggest just ditching the
> test, too. :) But I think what you've written here is a good approach.
> I'll look carefully over what you sent in the v3 series.

Yeah.  Thanks, both.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v4 00/10] A couple of CI improvements
  2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
                   ` (12 preceding siblings ...)
  2025-01-07 12:30 ` [PATCH v3 00/10] A couple of CI improvements Patrick Steinhardt
@ 2025-01-10 11:31 ` Patrick Steinhardt
  2025-01-10 11:31   ` [PATCH v4 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
                     ` (11 more replies)
  13 siblings, 12 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:31 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

Hi,

this patch series addresses a couple of issues I've found while
investigating flaky CI jobs. Besides two more fixes for flaky jobs it
also removes some stale code and simplifies the setup on GitHub Actions
to always use containerized jobs on Linux.

Test runs can be found for GitLab [1] and GitHub [2].

Changes in v2:

  - Expand a bit on the reasoning behind the conversion to use
    containerized jobs.
  - Fix commit message typo.
  - Properly fix the race in t7422 via pipe stuffing, as proposed by
    Peff.
  - Link to v1: https://lore.kernel.org/r/20250103-b4-pks-ci-fixes-v1-0-a9bb95dff833@pks.im

Changes in v3:

  - Another iteration on the SIGPIPE test, which should now finally plug
    the race.
  - Link to v2: https://lore.kernel.org/r/20250106-b4-pks-ci-fixes-v2-0-06ae540771b7@pks.im

Changes in v4:

  - Improve the commit message of the SIGPIPE test commit to more
    accurately describe the race.
  - Link to v3: https://lore.kernel.org/r/20250107-b4-pks-ci-fixes-v3-0-546a0ebc8481@pks.im

Thanks!

Patrick

[1]: https://gitlab.com/gitlab-org/git/-/merge_requests/277
[2]: https://github.com/git/git/pull/1865

---
Patrick Steinhardt (10):
      t0060: fix EBUSY in MinGW when setting up runtime prefix
      t7422: fix flaky test caused by buffered stdout
      github: adapt containerized jobs to be rootless
      github: convert all Linux jobs to be containerized
      github: simplify computation of the job's distro
      gitlab-ci: remove the "linux-old" job
      gitlab-ci: add linux32 job testing against i386
      ci: stop special-casing for Ubuntu 16.04
      ci: use latest Ubuntu release
      ci: remove stale code for Azure Pipelines

 .github/workflows/main.yml  | 78 ++++++++++++++++++++++-----------------------
 .gitlab-ci.yml              | 19 ++++++-----
 ci/install-dependencies.sh  |  6 ++--
 ci/lib.sh                   | 34 +++-----------------
 ci/print-test-failures.sh   |  5 ---
 t/t0060-path-utils.sh       | 10 +++---
 t/t7422-submodule-output.sh | 43 ++++++++++++++++++++++---
 7 files changed, 100 insertions(+), 95 deletions(-)

Range-diff versus v3:

 1:  324b174988 =  1:  ca4dc636aa t0060: fix EBUSY in MinGW when setting up runtime prefix
 2:  cc095aa2b1 !  2:  d41c00feb1 t7422: fix flaky test caused by buffered stdout
    @@ Commit message
             error: last command exited with $?=1
             not ok 18 - git submodule status --recursive propagates SIGPIPE
     
    -    The issue is caused by us using grep(1) to terminate the pipe on the
    -    first matching line in the recursing git-submodule(1) process. Standard
    -    streams are typically buffered though, so this condition is racy and may
    -    cause us to terminate the pipe after git-submodule(1) has already
    -    exited, and in that case we wouldn't see the expected signal.
    +    The issue is caused by a race between git-submodule(1) and grep(1):
    +
    +      1. git-submodule(1) (or its child process) writes the first X/S line
    +         we're trying to match.
    +
    +      2. grep(1) matches the line.
    +
    +      3a. grep(1) exits, closing the pipe.
    +
    +      3b. git-submodule(1) (or its child process) writes the rest of its
    +      lines.
    +
    +    Steps 3a and 3b happen at the same time without any guarantees. If 3a
    +    happens first, we get SIGPIPE. Otherwise, we don't and the test fails.
     
         Fix the issue by generating a couple thousand nested submodules and
         matching on the first nested submodule. This ensures that the recursive
         git-submodule(1) process completely fills its stdout buffer, which makes
         subsequent writes block until the downstream consumer of the pipe either
    -    fully drains it or closes it.
    +    reads more or closes it.
     
         To verify that this works as expected one can apply the following patch
         to the preimage of this commit, which used to reliably trigger the race:
 3:  73c98e7628 =  3:  6593e3307c github: adapt containerized jobs to be rootless
 4:  4d8f2bdce7 =  4:  9997698c6e github: convert all Linux jobs to be containerized
 5:  4a987aa42e =  5:  65797c51dc github: simplify computation of the job's distro
 6:  bd44700668 =  6:  687ae0c3f2 gitlab-ci: remove the "linux-old" job
 7:  e095c6757c =  7:  974229776b gitlab-ci: add linux32 job testing against i386
 8:  f885740877 =  8:  112ea61a6b ci: stop special-casing for Ubuntu 16.04
 9:  17b19dc51e =  9:  465cd85898 ci: use latest Ubuntu release
10:  95ce4406c7 = 10:  d671ee1f7f ci: remove stale code for Azure Pipelines

---
base-commit: 1b4e9a5f8b5f048972c21fe8acafe0404096f694
change-id: 20250103-b4-pks-ci-fixes-2d0a23fb5c78


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v4 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
@ 2025-01-10 11:31   ` Patrick Steinhardt
  2025-01-10 11:31   ` [PATCH v4 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
                     ` (10 subsequent siblings)
  11 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:31 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

Two of our tests in t0060 verify that the runtime prefix functionality
works as expected by creating a separate directory hierarchy, copying
the Git executable in there and then creating scripts relative to that
executable.

These tests fail quite regularly in GitLab CI with the following error:

    expecting success of 0060.218 '%(prefix)/ works':
            mkdir -p pretend/bin &&
            cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
            git config yes.path "%(prefix)/yes" &&
            GIT_EXEC_PATH= ./pretend/bin/git config --path yes.path >actual &&
            echo "$(pwd)/pretend/yes" >expect &&
            test_cmp expect actual
    ++ mkdir -p pretend/bin
    ++ cp /c/GitLab-Runner/builds/gitlab-org/git/git.exe pretend/bin/
    cp: cannot create regular file 'pretend/bin/git.exe': Device or resource busy
    error: last command exited with $?=1
    not ok 218 - %(prefix)/ works

Seemingly, the "git.exe" binary we are trying to overwrite is still
being held open. It is somewhat puzzling why exactly that is: while the
preceding test _does_ write to and execute the same path, it should have
exited and shouldn't keep any backgrounded processes around. So it must
be held open by something else, either in MinGW or in Windows itself.

While the root cause is puzzling, the workaround is trivial enough:
instead of writing the file twice we simply pull the common setup into a
separate test case so that we won't observe EBUSY in the first place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t0060-path-utils.sh | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index dbb2e73bcd912ae6a804603ff54e4c609966fa5d..8545cdfab559b4e247cb2699965e637529fd930a 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -592,17 +592,19 @@ test_lazy_prereq CAN_EXEC_IN_PWD '
 	./git rev-parse
 '
 
+test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'setup runtime prefix' '
+	mkdir -p pretend/bin &&
+	cp "$GIT_EXEC_PATH"/git$X pretend/bin/
+'
+
 test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX works' '
-	mkdir -p pretend/bin pretend/libexec/git-core &&
+	mkdir -p pretend/libexec/git-core &&
 	echo "echo HERE" | write_script pretend/libexec/git-core/git-here &&
-	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
 	GIT_EXEC_PATH= ./pretend/bin/git here >actual &&
 	echo HERE >expect &&
 	test_cmp expect actual'
 
 test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' '
-	mkdir -p pretend/bin &&
-	cp "$GIT_EXEC_PATH"/git$X pretend/bin/ &&
 	git config yes.path "%(prefix)/yes" &&
 	GIT_EXEC_PATH= ./pretend/bin/git config --path yes.path >actual &&
 	echo "$(pwd)/pretend/yes" >expect &&

-- 
2.48.0.rc2.279.g1de40edade.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v4 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
  2025-01-10 11:31   ` [PATCH v4 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
@ 2025-01-10 11:31   ` Patrick Steinhardt
  2025-01-24  9:16     ` Christian Couder
  2025-01-10 11:31   ` [PATCH v4 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
                     ` (9 subsequent siblings)
  11 siblings, 1 reply; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:31 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

One test in t7422 asserts that `git submodule status --recursive`
properly handles SIGPIPE. This test is flaky though and may sometimes
not see a SIGPIPE at all:

    expecting success of 7422.18 'git submodule status --recursive propagates SIGPIPE':
            { git submodule status --recursive 2>err; echo $?>status; } |
                    grep -q X/S &&
            test_must_be_empty err &&
            test_match_signal 13 "$(cat status)"
    ++ git submodule status --recursive
    ++ grep -q X/S
    ++ echo 0
    ++ test_must_be_empty err
    ++ test 1 -ne 1
    ++ test_path_is_file err
    ++ test 1 -ne 1
    ++ test -f err
    ++ test -s err
    +++ cat status
    ++ test_match_signal 13 0
    ++ test 0 = 141
    ++ test 0 = 269
    ++ return 1
    error: last command exited with $?=1
    not ok 18 - git submodule status --recursive propagates SIGPIPE

The issue is caused by a race between git-submodule(1) and grep(1):

  1. git-submodule(1) (or its child process) writes the first X/S line
     we're trying to match.

  2. grep(1) matches the line.

  3a. grep(1) exits, closing the pipe.

  3b. git-submodule(1) (or its child process) writes the rest of its
  lines.

Steps 3a and 3b happen at the same time without any guarantees. If 3a
happens first, we get SIGPIPE. Otherwise, we don't and the test fails.

Fix the issue by generating a couple thousand nested submodules and
matching on the first nested submodule. This ensures that the recursive
git-submodule(1) process completely fills its stdout buffer, which makes
subsequent writes block until the downstream consumer of the pipe either
reads more or closes it.

To verify that this works as expected one can apply the following patch
to the preimage of this commit, which used to reliably trigger the race:

    diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
    index 3c5177cc30..df6001f8a0 100755
    --- a/t/t7422-submodule-output.sh
    +++ b/t/t7422-submodule-output.sh
    @@ -202,7 +202,7 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE'
     		cd repo &&
     		GIT_ALLOW_PROTOCOL=file git submodule add "$(pwd)"/../submodule &&
     		{ git submodule status --recursive 2>err; echo $?>status; } |
    -			grep -q recursive-submodule-path-1 &&
    +			{ sleep 1 && grep -q recursive-submodule-path-1 && sleep 1; } &&
     		test_must_be_empty err &&
     		test_match_signal 13 "$(cat status)"
     	)

With the pipe-stuffing workaround the test runs successfully.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t7422-submodule-output.sh | 43 +++++++++++++++++++++++++++++++++++++++----
 1 file changed, 39 insertions(+), 4 deletions(-)

diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index f21e9203678b94701281d5339ae8bfe53d5de0ed..023a5cbdc44bac2389fca45cf7017750627c4ce9 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -167,10 +167,45 @@ do
 done
 
 test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
-	{ git submodule status --recursive 2>err; echo $?>status; } |
-		grep -q X/S &&
-	test_must_be_empty err &&
-	test_match_signal 13 "$(cat status)"
+	# The test setup is somewhat involved because triggering a SIGPIPE is
+	# racy with buffered pipes. To avoid the raciness we thus need to make
+	# sure that the subprocess in question fills the buffers completely,
+	# which requires a couple thousand submodules in total.
+	test_when_finished "rm -rf submodule repo" &&
+	git init submodule &&
+	(
+		cd submodule &&
+		test_commit initial &&
+
+		COMMIT=$(git rev-parse HEAD) &&
+		for i in $(test_seq 2000)
+		do
+			printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i" ||
+			return 1
+		done >gitmodules &&
+		BLOB=$(git hash-object -w --stdin <gitmodules) &&
+
+		printf "100644 blob $BLOB\t.gitmodules\n" >tree &&
+		for i in $(test_seq 2000)
+		do
+			printf "160000 commit $COMMIT\trecursive-submodule-path-%d\n" "$i" ||
+			return 1
+		done >>tree &&
+		TREE=$(git mktree <tree) &&
+
+		COMMIT=$(git commit-tree "$TREE") &&
+		git reset --hard "$COMMIT"
+	) &&
+
+	git init repo &&
+	(
+		cd repo &&
+		GIT_ALLOW_PROTOCOL=file git submodule add "$(pwd)"/../submodule &&
+		{ git submodule status --recursive 2>err; echo $?>status; } |
+			grep -q recursive-submodule-path-1 &&
+		test_must_be_empty err &&
+		test_match_signal 13 "$(cat status)"
+	)
 '
 
 test_done

-- 
2.48.0.rc2.279.g1de40edade.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v4 03/10] github: adapt containerized jobs to be rootless
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
  2025-01-10 11:31   ` [PATCH v4 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
  2025-01-10 11:31   ` [PATCH v4 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
@ 2025-01-10 11:31   ` Patrick Steinhardt
  2025-01-24  9:56     ` Christian Couder
  2025-08-28  9:58     ` Johannes Schindelin
  2025-01-10 11:32   ` [PATCH v4 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
                     ` (8 subsequent siblings)
  11 siblings, 2 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:31 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

The containerized jobs in GitHub Actions run as root, giving them
special permissions to for example delete files even when the user
shouldn't be able to due to file permissions. This limitation keeps us
from using containerized jobs for most of our Ubuntu-based jobs as it
causes a number of tests to fail.

Adapt the jobs to create a separate user that executes the test suite.
This follows similar infrastructure that we already have in GitLab CI.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 6 ++++--
 ci/install-dependencies.sh | 2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 900be9957a23fcaa64e1aefd0c8638c5f84b7997..b02f5873a540b458d38e7951b4ee3d5ca598ae23 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -371,10 +371,12 @@ jobs:
       run: apt -q update && apt -q -y install libc6-amd64 lib64stdc++6
     - uses: actions/checkout@v4
     - run: ci/install-dependencies.sh
-    - run: ci/run-build-and-tests.sh
+    - run: useradd builder --create-home
+    - run: chown -R builder .
+    - run: sudo --preserve-env --set-home --user=builder ci/run-build-and-tests.sh
     - name: print test failures
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
-      run: ci/print-test-failures.sh
+      run: sudo --preserve-env --set-home --user=builder ci/print-test-failures.sh
     - name: Upload failed tests' directories
       if: failure() && env.FAILED_TEST_ARTIFACTS != ''
       uses: actions/upload-artifact@v4
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index d1cb9fa8785388b3674fcea4dd682abc0725c968..ecb5b9d36c20d3e7e96148ac628a96c62642c308 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -31,7 +31,7 @@ alpine-*)
 	;;
 fedora-*|almalinux-*)
 	dnf -yq update >/dev/null &&
-	dnf -yq install make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
+	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
 	;;
 ubuntu-*|ubuntu32-*|debian-*)
 	# Required so that apt doesn't wait for user input on certain packages.

-- 
2.48.0.rc2.279.g1de40edade.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v4 04/10] github: convert all Linux jobs to be containerized
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2025-01-10 11:31   ` [PATCH v4 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
@ 2025-01-10 11:32   ` Patrick Steinhardt
  2025-01-10 11:32   ` [PATCH v4 05/10] github: simplify computation of the job's distro Patrick Steinhardt
                     ` (7 subsequent siblings)
  11 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:32 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

We have split the CI jobs in GitHub Workflows into two categories:

  - Those running on a machine pool directly.

  - Those running in a container on the machine pool.

The latter is more flexible because it allows us to freely pick whatever
container image we want to use for a specific job, while the former only
allows us to pick from a handful of different distros. The containerized
jobs do not have any significant downsides to the best of my knowledge:

  - They aren't significantly slower to start up. A quick comparison by
    Peff shows that the difference is mostly lost in the noise:

            job             |  old | new
        --------------------|------|------
        linux-TEST-vars      11m30s 10m54s
        linux-asan-ubsan     30m26s 31m14s
        linux-gcc             9m47s 10m6s
        linux-gcc-default     9m47s  9m41s
        linux-leaks          25m50s 25m21s
        linux-meson          10m36s 10m41s
        linux-reftable       10m25s 10m23s
        linux-reftable-leaks 27m18s 27m28s
        linux-sha256          9m54s 10m31s

    Some jobs are a bit faster, some are a bit slower, but there does
    not seem to be any significant change.

  - Containerized jobs run as root, which keeps a couple of tests from
    running. This has been addressed in the preceding commit though,
    where we now use setpriv(1) to run tests as a separate user.

  - GitHub injects a Node binary into containerized jobs, which is
    dynamically linked. This has led to some issues in the past [1], but
    only for our 32 bit jobs. The issues have since been resolved.

Overall there seem to be no downsides, but the upside is that we have
more control over the exact image that these jobs use. Convert the Linux
jobs accordingly.

[1]: https://lore.kernel.org/git/20240912094841.GD589828@coredump.intra.peff.net/

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 68 ++++++++++++++++++++++++++--------------------
 1 file changed, 39 insertions(+), 29 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index b02f5873a540b458d38e7951b4ee3d5ca598ae23..8e5847da4fab009ad699c18e1a5a336a8b45c3ed 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -259,20 +259,6 @@ jobs:
       fail-fast: false
       matrix:
         vector:
-          - jobname: linux-sha256
-            cc: clang
-            pool: ubuntu-latest
-          - jobname: linux-reftable
-            cc: clang
-            pool: ubuntu-latest
-          - jobname: linux-gcc
-            cc: gcc
-            cc_package: gcc-8
-            pool: ubuntu-20.04
-          - jobname: linux-TEST-vars
-            cc: gcc
-            cc_package: gcc-8
-            pool: ubuntu-20.04
           - jobname: osx-clang
             cc: clang
             pool: macos-13
@@ -285,21 +271,6 @@ jobs:
           - jobname: osx-meson
             cc: clang
             pool: macos-13
-          - jobname: linux-gcc-default
-            cc: gcc
-            pool: ubuntu-latest
-          - jobname: linux-leaks
-            cc: gcc
-            pool: ubuntu-latest
-          - jobname: linux-reftable-leaks
-            cc: gcc
-            pool: ubuntu-latest
-          - jobname: linux-asan-ubsan
-            cc: clang
-            pool: ubuntu-latest
-          - jobname: linux-meson
-            cc: gcc
-            pool: ubuntu-latest
     env:
       CC: ${{matrix.vector.cc}}
       CC_PACKAGE: ${{matrix.vector.cc_package}}
@@ -342,6 +313,44 @@ jobs:
       fail-fast: false
       matrix:
         vector:
+        - jobname: linux-sha256
+          image: ubuntu:latest
+          cc: clang
+          distro: ubuntu-latest
+        - jobname: linux-reftable
+          image: ubuntu:latest
+          cc: clang
+          distro: ubuntu-latest
+        - jobname: linux-gcc
+          image: ubuntu:20.04
+          cc: gcc
+          cc_package: gcc-8
+          distro: ubuntu-20.04
+        - jobname: linux-TEST-vars
+          image: ubuntu:20.04
+          cc: gcc
+          cc_package: gcc-8
+          distro: ubuntu-20.04
+        - jobname: linux-gcc-default
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
+        - jobname: linux-leaks
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
+        - jobname: linux-reftable-leaks
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
+        - jobname: linux-asan-ubsan
+          image: ubuntu:latest
+          cc: clang
+          distro: ubuntu-latest
+        - jobname: linux-meson
+          image: ubuntu:latest
+          cc: gcc
+          distro: ubuntu-latest
         - jobname: linux-musl
           image: alpine
           distro: alpine-latest
@@ -363,6 +372,7 @@ jobs:
     env:
       jobname: ${{matrix.vector.jobname}}
       distro: ${{matrix.vector.distro}}
+      CC: ${{matrix.vector.cc}}
     runs-on: ubuntu-latest
     container: ${{matrix.vector.image}}
     steps:

-- 
2.48.0.rc2.279.g1de40edade.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v4 05/10] github: simplify computation of the job's distro
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2025-01-10 11:32   ` [PATCH v4 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
@ 2025-01-10 11:32   ` Patrick Steinhardt
  2025-01-10 11:32   ` [PATCH v4 06/10] gitlab-ci: remove the "linux-old" job Patrick Steinhardt
                     ` (6 subsequent siblings)
  11 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:32 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

We explicitly list the distro of Linux-based jobs, but it is equivalent
to the name of the image in almost all cases, except that colons are
replaced with dashes. Drop the redundant information and massage it in
our CI scripts, which is equivalent to how we do it in GitLab CI.

There are a couple of exceptions:

  - The "linux32" job, whose distro name is different than the image
    name. This is handled by adapting all sites to use the new name.

  - The "alpine" and "fedora" jobs, neither of which specify a tag for
    their image. This is handled by adding the "latest" tag.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 22 ++++------------------
 ci/install-dependencies.sh |  4 ++--
 ci/lib.sh                  |  2 ++
 3 files changed, 8 insertions(+), 20 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 8e5847da4fab009ad699c18e1a5a336a8b45c3ed..b54da639a650682495994e3c7b137eab4e6cb3bf 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -275,7 +275,7 @@ jobs:
       CC: ${{matrix.vector.cc}}
       CC_PACKAGE: ${{matrix.vector.cc_package}}
       jobname: ${{matrix.vector.jobname}}
-      distro: ${{matrix.vector.pool}}
+      CI_JOB_IMAGE: ${{matrix.vector.pool}}
       TEST_OUTPUT_DIRECTORY: ${{github.workspace}}/t
     runs-on: ${{matrix.vector.pool}}
     steps:
@@ -316,63 +316,49 @@ jobs:
         - jobname: linux-sha256
           image: ubuntu:latest
           cc: clang
-          distro: ubuntu-latest
         - jobname: linux-reftable
           image: ubuntu:latest
           cc: clang
-          distro: ubuntu-latest
         - jobname: linux-gcc
           image: ubuntu:20.04
           cc: gcc
           cc_package: gcc-8
-          distro: ubuntu-20.04
         - jobname: linux-TEST-vars
           image: ubuntu:20.04
           cc: gcc
           cc_package: gcc-8
-          distro: ubuntu-20.04
         - jobname: linux-gcc-default
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-leaks
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-reftable-leaks
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-asan-ubsan
           image: ubuntu:latest
           cc: clang
-          distro: ubuntu-latest
         - jobname: linux-meson
           image: ubuntu:latest
           cc: gcc
-          distro: ubuntu-latest
         - jobname: linux-musl
-          image: alpine
-          distro: alpine-latest
+          image: alpine:latest
         # Supported until 2025-04-02.
         - jobname: linux32
           image: i386/ubuntu:focal
-          distro: ubuntu32-20.04
         - jobname: pedantic
-          image: fedora
-          distro: fedora-latest
+          image: fedora:latest
         # A RHEL 8 compatible distro.  Supported until 2029-05-31.
         - jobname: almalinux-8
           image: almalinux:8
-          distro: almalinux-8
         # Supported until 2026-08-31.
         - jobname: debian-11
           image: debian:11
-          distro: debian-11
     env:
       jobname: ${{matrix.vector.jobname}}
-      distro: ${{matrix.vector.distro}}
       CC: ${{matrix.vector.cc}}
+      CI_JOB_IMAGE: ${{matrix.vector.image}}
     runs-on: ubuntu-latest
     container: ${{matrix.vector.image}}
     steps:
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index ecb5b9d36c20d3e7e96148ac628a96c62642c308..d5a959e25ff3236656ff3416b81732ec5c2107c1 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -33,7 +33,7 @@ fedora-*|almalinux-*)
 	dnf -yq update >/dev/null &&
 	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
 	;;
-ubuntu-*|ubuntu32-*|debian-*)
+ubuntu-*|i386/ubuntu-*|debian-*)
 	# Required so that apt doesn't wait for user input on certain packages.
 	export DEBIAN_FRONTEND=noninteractive
 
@@ -42,7 +42,7 @@ ubuntu-*|ubuntu32-*|debian-*)
 		SVN='libsvn-perl subversion'
 		LANGUAGES='language-pack-is'
 		;;
-	ubuntu32-*)
+	i386/ubuntu-*)
 		SVN=
 		LANGUAGES='language-pack-is'
 		;;
diff --git a/ci/lib.sh b/ci/lib.sh
index 8885ee3c3f86c62e8783d27756b8779bd491e7e6..f8b68ab8a6546802756fd516ca15a2c97223da5f 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -246,6 +246,8 @@ then
 
 	GIT_TEST_OPTS="--github-workflow-markup"
 	JOBS=10
+
+	distro=$(echo "$CI_JOB_IMAGE" | tr : -)
 elif test true = "$GITLAB_CI"
 then
 	CI_TYPE=gitlab-ci

-- 
2.48.0.rc2.279.g1de40edade.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v4 06/10] gitlab-ci: remove the "linux-old" job
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2025-01-10 11:32   ` [PATCH v4 05/10] github: simplify computation of the job's distro Patrick Steinhardt
@ 2025-01-10 11:32   ` Patrick Steinhardt
  2025-01-10 11:32   ` [PATCH v4 07/10] gitlab-ci: add linux32 job testing against i386 Patrick Steinhardt
                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:32 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

The "linux-old" job was historically testing against the oldest
supported LTS release of Ubuntu. But with c85bcb5de1 (gitlab-ci: switch
from Ubuntu 16.04 to 20.04, 2024-10-31) it has been converted to test
against Ubuntu 20.04, which already gets exercised in a couple of other
CI jobs. It's thus not adding any significant test coverage.

Drop the job.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 9254e01583306e67dc12b6b9e0015183e1108655..00bc727865031620752771af4a9030c7de1b73df 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -36,9 +36,6 @@ test:linux:
       fi
   parallel:
     matrix:
-      - jobname: linux-old
-        image: ubuntu:20.04
-        CC: gcc
       - jobname: linux-sha256
         image: ubuntu:latest
         CC: clang

-- 
2.48.0.rc2.279.g1de40edade.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v4 07/10] gitlab-ci: add linux32 job testing against i386
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2025-01-10 11:32   ` [PATCH v4 06/10] gitlab-ci: remove the "linux-old" job Patrick Steinhardt
@ 2025-01-10 11:32   ` Patrick Steinhardt
  2025-01-10 11:32   ` [PATCH v4 08/10] ci: stop special-casing for Ubuntu 16.04 Patrick Steinhardt
                     ` (4 subsequent siblings)
  11 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:32 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

Add another job to GitLab CI that tests against the i386 architecture.
This job is equivalent to the same job in GitHub Workflows.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .gitlab-ci.yml | 2 ++
 ci/lib.sh      | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 00bc727865031620752771af4a9030c7de1b73df..29e9056dd5010f8843e42aeae8410973c825de54 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -66,6 +66,8 @@ test:linux:
         image: fedora:latest
       - jobname: linux-musl
         image: alpine:latest
+      - jobname: linux32
+        image: i386/ubuntu:20.04
       - jobname: linux-meson
         image: ubuntu:latest
         CC: gcc
diff --git a/ci/lib.sh b/ci/lib.sh
index f8b68ab8a6546802756fd516ca15a2c97223da5f..2293849ada3b45873f80e4392ab93c65657d0f13 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -269,7 +269,7 @@ then
 		CI_OS_NAME=osx
 		JOBS=$(nproc)
 		;;
-	*,alpine:*|*,fedora:*|*,ubuntu:*)
+	*,alpine:*|*,fedora:*|*,ubuntu:*|*,i386/ubuntu:*)
 		CI_OS_NAME=linux
 		JOBS=$(nproc)
 		;;

-- 
2.48.0.rc2.279.g1de40edade.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v4 08/10] ci: stop special-casing for Ubuntu 16.04
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2025-01-10 11:32   ` [PATCH v4 07/10] gitlab-ci: add linux32 job testing against i386 Patrick Steinhardt
@ 2025-01-10 11:32   ` Patrick Steinhardt
  2025-01-10 11:32   ` [PATCH v4 09/10] ci: use latest Ubuntu release Patrick Steinhardt
                     ` (3 subsequent siblings)
  11 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:32 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

With c85bcb5de1 (gitlab-ci: switch from Ubuntu 16.04 to 20.04,
2024-10-31) we have adapted the last CI job to stop using Ubuntu 16.04
in favor of Ubuntu 20.04. Remove the special-casing we still have in our
CI scripts.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ci/lib.sh | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 2293849ada3b45873f80e4392ab93c65657d0f13..77a4aabdb8fb416c1733f02d02145b6bc0849998 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -347,14 +347,7 @@ ubuntu-*)
 	fi
 	MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/$PYTHON_PACKAGE"
 
-	case "$distro" in
-	ubuntu-16.04)
-		# Apache is too old for HTTP/2.
-		;;
-	*)
-		export GIT_TEST_HTTPD=true
-		;;
-	esac
+	export GIT_TEST_HTTPD=true
 
 	# The Linux build installs the defined dependency versions below.
 	# The OS X build installs much more recent versions, whichever

-- 
2.48.0.rc2.279.g1de40edade.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v4 09/10] ci: use latest Ubuntu release
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2025-01-10 11:32   ` [PATCH v4 08/10] ci: stop special-casing for Ubuntu 16.04 Patrick Steinhardt
@ 2025-01-10 11:32   ` Patrick Steinhardt
  2025-01-10 11:32   ` [PATCH v4 10/10] ci: remove stale code for Azure Pipelines Patrick Steinhardt
                     ` (2 subsequent siblings)
  11 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:32 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

Both GitHub Actions and GitLab CI use the "ubuntu:latest" tag as the
default image for most jobs. This tag is somewhat misleading though, as
it does not refer to the latest release of Ubuntu, but to the latest LTS
release thereof. But as we already have a couple of jobs exercising the
oldest LTS release of Ubuntu that Git still supports, it would make more
sense to test the oldest and youngest versions of Ubuntu.

Adapt these jobs to instead use the "ubuntu:rolling" tag, which refers
to the actual latest release, which currently is Ubuntu 24.10.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 .github/workflows/main.yml | 14 +++++++-------
 .gitlab-ci.yml             | 14 +++++++-------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index b54da639a650682495994e3c7b137eab4e6cb3bf..b90381ae015edf9db5aa4b8c0ace9bb5c549c37b 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -314,10 +314,10 @@ jobs:
       matrix:
         vector:
         - jobname: linux-sha256
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: clang
         - jobname: linux-reftable
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: clang
         - jobname: linux-gcc
           image: ubuntu:20.04
@@ -328,19 +328,19 @@ jobs:
           cc: gcc
           cc_package: gcc-8
         - jobname: linux-gcc-default
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-leaks
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-reftable-leaks
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-asan-ubsan
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: clang
         - jobname: linux-meson
-          image: ubuntu:latest
+          image: ubuntu:rolling
           cc: gcc
         - jobname: linux-musl
           image: alpine:latest
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 29e9056dd5010f8843e42aeae8410973c825de54..8ed3ff5f0373d70b6f609dc5292dda2dd7fd8f88 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -37,10 +37,10 @@ test:linux:
   parallel:
     matrix:
       - jobname: linux-sha256
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: clang
       - jobname: linux-reftable
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: clang
       - jobname: linux-gcc
         image: ubuntu:20.04
@@ -51,16 +51,16 @@ test:linux:
         CC: gcc
         CC_PACKAGE: gcc-8
       - jobname: linux-gcc-default
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
       - jobname: linux-leaks
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
       - jobname: linux-reftable-leaks
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
       - jobname: linux-asan-ubsan
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: clang
       - jobname: pedantic
         image: fedora:latest
@@ -69,7 +69,7 @@ test:linux:
       - jobname: linux32
         image: i386/ubuntu:20.04
       - jobname: linux-meson
-        image: ubuntu:latest
+        image: ubuntu:rolling
         CC: gcc
   artifacts:
     paths:

-- 
2.48.0.rc2.279.g1de40edade.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v4 10/10] ci: remove stale code for Azure Pipelines
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2025-01-10 11:32   ` [PATCH v4 09/10] ci: use latest Ubuntu release Patrick Steinhardt
@ 2025-01-10 11:32   ` Patrick Steinhardt
  2025-01-10 12:03   ` [PATCH v4 00/10] A couple of CI improvements Jeff King
  2025-01-24  9:59   ` Christian Couder
  11 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-10 11:32 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Phillip Wood

Support for Azure Pipelines has been retired in 6081d3898f (ci: retire
the Azure Pipelines definition, 2020-04-11) in favor of GitHub Actions.
Our CI library still has some infrastructure left for Azure though that
is now unused. Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ci/lib.sh                 | 21 +--------------------
 ci/print-test-failures.sh |  5 -----
 2 files changed, 1 insertion(+), 25 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 77a4aabdb8fb416c1733f02d02145b6bc0849998..4003354f16c048b969c0bb4340d2ee2777767300 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -206,26 +206,7 @@ export TERM=${TERM:-dumb}
 # Clear MAKEFLAGS that may come from the outside world.
 export MAKEFLAGS=
 
-if test -n "$SYSTEM_COLLECTIONURI" || test -n "$SYSTEM_TASKDEFINITIONSURI"
-then
-	CI_TYPE=azure-pipelines
-	# We are running in Azure Pipelines
-	CI_BRANCH="$BUILD_SOURCEBRANCH"
-	CI_COMMIT="$BUILD_SOURCEVERSION"
-	CI_JOB_ID="$BUILD_BUILDID"
-	CI_JOB_NUMBER="$BUILD_BUILDNUMBER"
-	CI_OS_NAME="$(echo "$AGENT_OS" | tr A-Z a-z)"
-	test darwin != "$CI_OS_NAME" || CI_OS_NAME=osx
-	CI_REPO_SLUG="$(expr "$BUILD_REPOSITORY_URI" : '.*/\([^/]*/[^/]*\)$')"
-	CC="${CC:-gcc}"
-
-	# use a subdirectory of the cache dir (because the file share is shared
-	# among *all* phases)
-	cache_dir="$HOME/test-cache/$SYSTEM_PHASENAME"
-
-	GIT_TEST_OPTS="--write-junit-xml"
-	JOBS=10
-elif test true = "$GITHUB_ACTIONS"
+if test true = "$GITHUB_ACTIONS"
 then
 	CI_TYPE=github-actions
 	CI_BRANCH="$GITHUB_REF"
diff --git a/ci/print-test-failures.sh b/ci/print-test-failures.sh
index 655687dd827e5b3e4d4879803b0d4499e7751380..dc910e51609cd7344b1ad03fdb4e820e47ad3a88 100755
--- a/ci/print-test-failures.sh
+++ b/ci/print-test-failures.sh
@@ -39,11 +39,6 @@ do
 		test_name="${test_name##*/}"
 		trash_dir="trash directory.$test_name"
 		case "$CI_TYPE" in
-		azure-pipelines)
-			mkdir -p failed-test-artifacts
-			mv "$trash_dir" failed-test-artifacts
-			continue
-			;;
 		github-actions)
 			mkdir -p failed-test-artifacts
 			echo "FAILED_TEST_ARTIFACTS=${TEST_OUTPUT_DIRECTORY:t}/failed-test-artifacts" >>$GITHUB_ENV

-- 
2.48.0.rc2.279.g1de40edade.dirty


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH v4 00/10] A couple of CI improvements
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (9 preceding siblings ...)
  2025-01-10 11:32   ` [PATCH v4 10/10] ci: remove stale code for Azure Pipelines Patrick Steinhardt
@ 2025-01-10 12:03   ` Jeff King
  2025-01-24  9:59   ` Christian Couder
  11 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2025-01-10 12:03 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Junio C Hamano, Phillip Wood

On Fri, Jan 10, 2025 at 12:31:56PM +0100, Patrick Steinhardt wrote:

> Changes in v4:
> 
>   - Improve the commit message of the SIGPIPE test commit to more
>     accurately describe the race.

Thank you for addressing my nits. :) The result looks good to me.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v4 02/10] t7422: fix flaky test caused by buffered stdout
  2025-01-10 11:31   ` [PATCH v4 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
@ 2025-01-24  9:16     ` Christian Couder
  0 siblings, 0 replies; 69+ messages in thread
From: Christian Couder @ 2025-01-24  9:16 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Jeff King, Junio C Hamano, Phillip Wood

On Fri, Jan 10, 2025 at 12:32 PM Patrick Steinhardt <ps@pks.im> wrote:

> Fix the issue by generating a couple thousand nested submodules and
> matching on the first nested submodule. This ensures that the recursive
> git-submodule(1) process completely fills its stdout buffer,

The patch looks great to me and I like the previous discussion with
Peff about it. I just want to say that, after reading the discussion
and then this paragraph, I wondered if it would have been possible to
instead have a `test-tool submodule` helper that would behave the same
as `git submodule` except that it would call setvbuf() to reduce the
size of the stdout buffer. This might have allowed a test that didn't
need 2000 nested submodules, and thus might have been faster. No need
to change anything though.

> which makes
> subsequent writes block until the downstream consumer of the pipe either
> reads more or closes it.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v4 03/10] github: adapt containerized jobs to be rootless
  2025-01-10 11:31   ` [PATCH v4 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
@ 2025-01-24  9:56     ` Christian Couder
  2025-08-28  9:58     ` Johannes Schindelin
  1 sibling, 0 replies; 69+ messages in thread
From: Christian Couder @ 2025-01-24  9:56 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Jeff King, Junio C Hamano, Phillip Wood

On Fri, Jan 10, 2025 at 12:34 PM Patrick Steinhardt <ps@pks.im> wrote:
>
> The containerized jobs in GitHub Actions run as root, giving them
> special permissions to for example delete files even when the user
> shouldn't be able to due to file permissions. This limitation keeps us
> from using containerized jobs for most of our Ubuntu-based jobs as it
> causes a number of tests to fail.
>
> Adapt the jobs to create a separate user that executes the test suite.
> This follows similar infrastructure that we already have in GitLab CI.

Nit (not worth a reroll): It might help a bit to say something like:

 "This requires installing the 'sudo' and 'shadow-utils' (for
`useradd`) packages."

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v4 00/10] A couple of CI improvements
  2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
                     ` (10 preceding siblings ...)
  2025-01-10 12:03   ` [PATCH v4 00/10] A couple of CI improvements Jeff King
@ 2025-01-24  9:59   ` Christian Couder
  2025-01-27  7:00     ` Patrick Steinhardt
  11 siblings, 1 reply; 69+ messages in thread
From: Christian Couder @ 2025-01-24  9:59 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Jeff King, Junio C Hamano, Phillip Wood

On Fri, Jan 10, 2025 at 12:34 PM Patrick Steinhardt <ps@pks.im> wrote:

> this patch series addresses a couple of issues I've found while
> investigating flaky CI jobs. Besides two more fixes for flaky jobs it
> also removes some stale code and simplifies the setup on GitHub Actions
> to always use containerized jobs on Linux.

I left a few comments but I don't think they require a reroll. This
series looks good to me too.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v4 00/10] A couple of CI improvements
  2025-01-24  9:59   ` Christian Couder
@ 2025-01-27  7:00     ` Patrick Steinhardt
  0 siblings, 0 replies; 69+ messages in thread
From: Patrick Steinhardt @ 2025-01-27  7:00 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Jeff King, Junio C Hamano, Phillip Wood

On Fri, Jan 24, 2025 at 10:59:04AM +0100, Christian Couder wrote:
> On Fri, Jan 10, 2025 at 12:34 PM Patrick Steinhardt <ps@pks.im> wrote:
> 
> > this patch series addresses a couple of issues I've found while
> > investigating flaky CI jobs. Besides two more fixes for flaky jobs it
> > also removes some stale code and simplifies the setup on GitHub Actions
> > to always use containerized jobs on Linux.
> 
> I left a few comments but I don't think they require a reroll. This
> series looks good to me too.

Thanks for your review!

Patrick

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v4 03/10] github: adapt containerized jobs to be rootless
  2025-01-10 11:31   ` [PATCH v4 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
  2025-01-24  9:56     ` Christian Couder
@ 2025-08-28  9:58     ` Johannes Schindelin
  2025-11-17 17:30       ` Johannes Schindelin
  1 sibling, 1 reply; 69+ messages in thread
From: Johannes Schindelin @ 2025-08-28  9:58 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Jeff King, Junio C Hamano, Phillip Wood

Hi Patrick,

On Fri, 10 Jan 2025, Patrick Steinhardt wrote:

> diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
> index 900be9957a23fcaa64e1aefd0c8638c5f84b7997..b02f5873a540b458d38e7951b4ee3d5ca598ae23 100644
> --- a/.github/workflows/main.yml
> +++ b/.github/workflows/main.yml
> @@ -371,10 +371,12 @@ jobs:
>        run: apt -q update && apt -q -y install libc6-amd64 lib64stdc++6
>      - uses: actions/checkout@v4
>      - run: ci/install-dependencies.sh
> -    - run: ci/run-build-and-tests.sh
> +    - run: useradd builder --create-home
> +    - run: chown -R builder .
> +    - run: sudo --preserve-env --set-home --user=builder ci/run-build-and-tests.sh

I am afraid that this is not enough. Sure, it works as long as the tests
are passing, but the entire point of running the tests is to catch _and
debug_ when they are failing. Otherwise a lot of money and effort could be
saved simply by deleting those tests.

When the tests are failing, the detailed test logs are supposed to be
shown, but as I noticed most recently in
https://github.com/microsoft/git/actions/runs/17278881863/job/49042596457?pr=787#step:9:1933
there is a fatal error that prevents them from being shown let alone
uploaded:

  [...]
  Test Summary Report
  -------------------
  t5799-gvfs-helper.sh                             (Wstat: 256 Tests: 36 Failed: 1)
    Failed test:  25
    Non-zero exit status: 1
  Files=1040, Tests=31137, 543 wallclock secs ( 8.01 usr  2.16 sys + 611.98 cusr 1100.12 csys = 1722.27 CPU)
  Result: FAIL
  make[1]: *** [Makefile:78: prove] Error 1
  ++ cat exit.status
  make[1]: Leaving directory '/__w/git/git/t'
  make: *** [Makefile:3362: test] Error 2
  + res=2
  + rm exit.status
  + end_group 'Run tests'
  + test -n t
  + set +x
  ci/lib.sh: line 221: /__w/_temp/_runner_file_commands/set_env_cca39642-cc57-484c-b7d4-27bbd4dc8260: Permission denied
  Error: Process completed with exit code 1.

This error causes the next two steps to be skipped, the one that is
supposed to show the detailed test logs, and the one to upload the failed
tests' directories, precluding any further attempt at debugging the test
failures. Even the part of that step that is supposed to show the failed
_test case's_ logs, as a last resort, fails to show anything because it is
skipped because of that error, too.

Due to various reasons, I cannot investigate this any further. At the same
time, I suspect that you need some hack like adding the `builder` user to
some group that has write access to `/__w/_temp/` (which is most likely a
Docker volume that maps to the host's `$RUNNER_TEMP` or some such, and
therefore a `chmod` is unlikely to work, or it might lead to unintended
consequences in later steps of thw workflow) to allow the logic to perform
as desired.

Ciao,
Johannes

>      - name: print test failures
>        if: failure() && env.FAILED_TEST_ARTIFACTS != ''
> -      run: ci/print-test-failures.sh
> +      run: sudo --preserve-env --set-home --user=builder ci/print-test-failures.sh
>      - name: Upload failed tests' directories
>        if: failure() && env.FAILED_TEST_ARTIFACTS != ''
>        uses: actions/upload-artifact@v4
> diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
> index d1cb9fa8785388b3674fcea4dd682abc0725c968..ecb5b9d36c20d3e7e96148ac628a96c62642c308 100755
> --- a/ci/install-dependencies.sh
> +++ b/ci/install-dependencies.sh
> @@ -31,7 +31,7 @@ alpine-*)
>  	;;
>  fedora-*|almalinux-*)
>  	dnf -yq update >/dev/null &&
> -	dnf -yq install make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
> +	dnf -yq install shadow-utils sudo make gcc findutils diffutils perl python3 gettext zlib-devel expat-devel openssl-devel curl-devel pcre2-devel >/dev/null
>  	;;
>  ubuntu-*|ubuntu32-*|debian-*)
>  	# Required so that apt doesn't wait for user input on certain packages.
> 
> -- 
> 2.48.0.rc2.279.g1de40edade.dirty
> 
> 
> 

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v4 03/10] github: adapt containerized jobs to be rootless
  2025-08-28  9:58     ` Johannes Schindelin
@ 2025-11-17 17:30       ` Johannes Schindelin
  0 siblings, 0 replies; 69+ messages in thread
From: Johannes Schindelin @ 2025-11-17 17:30 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Jeff King, Junio C Hamano, Phillip Wood

Hi Patrick, me again,

On Thu, 28 Aug 2025, Johannes Schindelin wrote:

> On Fri, 10 Jan 2025, Patrick Steinhardt wrote:
> 
> > diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
> > index 900be9957a23fcaa64e1aefd0c8638c5f84b7997..b02f5873a540b458d38e7951b4ee3d5ca598ae23 100644
> > --- a/.github/workflows/main.yml
> > +++ b/.github/workflows/main.yml
> > @@ -371,10 +371,12 @@ jobs:
> >        run: apt -q update && apt -q -y install libc6-amd64 lib64stdc++6
> >      - uses: actions/checkout@v4
> >      - run: ci/install-dependencies.sh
> > -    - run: ci/run-build-and-tests.sh
> > +    - run: useradd builder --create-home
> > +    - run: chown -R builder .
> > +    - run: sudo --preserve-env --set-home --user=builder ci/run-build-and-tests.sh
> 
> I am afraid that this is not enough. Sure, it works as long as the tests
> are passing, but the entire point of running the tests is to catch _and
> debug_ when they are failing. Otherwise a lot of money and effort could be
> saved simply by deleting those tests.
> 
> When the tests are failing, the detailed test logs are supposed to be
> shown, but as I noticed most recently in
> https://github.com/microsoft/git/actions/runs/17278881863/job/49042596457?pr=787#step:9:1933
> there is a fatal error that prevents them from being shown let alone
> uploaded:
> 
>   [...]
>   Test Summary Report
>   -------------------
>   t5799-gvfs-helper.sh                             (Wstat: 256 Tests: 36 Failed: 1)
>     Failed test:  25
>     Non-zero exit status: 1
>   Files=1040, Tests=31137, 543 wallclock secs ( 8.01 usr  2.16 sys + 611.98 cusr 1100.12 csys = 1722.27 CPU)
>   Result: FAIL
>   make[1]: *** [Makefile:78: prove] Error 1
>   ++ cat exit.status
>   make[1]: Leaving directory '/__w/git/git/t'
>   make: *** [Makefile:3362: test] Error 2
>   + res=2
>   + rm exit.status
>   + end_group 'Run tests'
>   + test -n t
>   + set +x
>   ci/lib.sh: line 221: /__w/_temp/_runner_file_commands/set_env_cca39642-cc57-484c-b7d4-27bbd4dc8260: Permission denied
>   Error: Process completed with exit code 1.
> 
> This error causes the next two steps to be skipped, the one that is
> supposed to show the detailed test logs, and the one to upload the failed
> tests' directories, precluding any further attempt at debugging the test
> failures. Even the part of that step that is supposed to show the failed
> _test case's_ logs, as a last resort, fails to show anything because it is
> skipped because of that error, too.
> 
> Due to various reasons, I cannot investigate this any further. At the same
> time, I suspect that you need some hack like adding the `builder` user to
> some group that has write access to `/__w/_temp/` (which is most likely a
> Docker volume that maps to the host's `$RUNNER_TEMP` or some such, and
> therefore a `chmod` is unlikely to work, or it might lead to unintended
> consequences in later steps of thw workflow) to allow the logic to perform
> as desired.

I have contributed a patch for that via
https://lore.kernel.org/git/pull.2003.git.1763399064983.gitgitgadget@gmail.com/.
Unfortunately, I forgot to Cc: you, please accept my apologies for that
oversight.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 69+ messages in thread

end of thread, other threads:[~2025-11-17 17:30 UTC | newest]

Thread overview: 69+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-03 14:46 [PATCH 00/10] A couple of CI improvements Patrick Steinhardt
2025-01-03 14:46 ` [PATCH 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
2025-01-03 14:46 ` [PATCH 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
2025-01-03 18:17   ` Jeff King
2025-01-06 11:12     ` Patrick Steinhardt
2025-01-07  2:39       ` Jeff King
2025-01-07  8:47         ` Patrick Steinhardt
2025-01-07  8:50           ` Patrick Steinhardt
2025-01-09  7:17           ` Jeff King
2025-01-09 16:16             ` Junio C Hamano
2025-01-07  2:48       ` Jeff King
2025-01-07 16:02         ` Junio C Hamano
2025-01-03 14:46 ` [PATCH 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
2025-01-03 14:46 ` [PATCH 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
2025-01-03 18:56   ` Jeff King
2025-01-03 19:06     ` Jeff King
2025-01-06 11:12       ` Patrick Steinhardt
2025-01-03 19:16   ` Junio C Hamano
2025-01-03 14:46 ` [PATCH 05/10] github: simplify computation of the job's distro Patrick Steinhardt
2025-01-03 19:09   ` Junio C Hamano
2025-01-03 14:46 ` [PATCH 06/10] gitlab-ci: remove the "linux-old" job Patrick Steinhardt
2025-01-03 19:12   ` Junio C Hamano
2025-01-03 14:46 ` [PATCH 07/10] gitlab-ci: add linux32 job testing against i386 Patrick Steinhardt
2025-01-03 14:46 ` [PATCH 08/10] ci: stop special-casing for Ubuntu 16.04 Patrick Steinhardt
2025-01-03 14:46 ` [PATCH 09/10] ci: use latest Ubuntu release Patrick Steinhardt
2025-01-03 14:46 ` [PATCH 10/10] ci: remove stale code for Azure Pipelines Patrick Steinhardt
2025-01-03 18:57 ` [PATCH 00/10] A couple of CI improvements Jeff King
2025-01-06 11:16 ` [PATCH v2 " Patrick Steinhardt
2025-01-06 11:16   ` [PATCH v2 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
2025-01-06 11:16   ` [PATCH v2 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
2025-01-07  2:49     ` Jeff King
2025-01-06 11:16   ` [PATCH v2 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
2025-01-06 11:16   ` [PATCH v2 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
2025-01-06 11:16   ` [PATCH v2 05/10] github: simplify computation of the job's distro Patrick Steinhardt
2025-01-06 11:16   ` [PATCH v2 06/10] gitlab-ci: remove the "linux-old" job Patrick Steinhardt
2025-01-06 11:16   ` [PATCH v2 07/10] gitlab-ci: add linux32 job testing against i386 Patrick Steinhardt
2025-01-06 11:16   ` [PATCH v2 08/10] ci: stop special-casing for Ubuntu 16.04 Patrick Steinhardt
2025-01-06 11:16   ` [PATCH v2 09/10] ci: use latest Ubuntu release Patrick Steinhardt
2025-01-06 11:16   ` [PATCH v2 10/10] ci: remove stale code for Azure Pipelines Patrick Steinhardt
2025-01-07 12:30 ` [PATCH v3 00/10] A couple of CI improvements Patrick Steinhardt
2025-01-07 12:30   ` [PATCH v3 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
2025-01-07 12:30   ` [PATCH v3 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
2025-01-09  7:33     ` Jeff King
2025-01-07 12:30   ` [PATCH v3 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
2025-01-07 12:30   ` [PATCH v3 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
2025-01-07 12:30   ` [PATCH v3 05/10] github: simplify computation of the job's distro Patrick Steinhardt
2025-01-07 12:30   ` [PATCH v3 06/10] gitlab-ci: remove the "linux-old" job Patrick Steinhardt
2025-01-07 12:30   ` [PATCH v3 07/10] gitlab-ci: add linux32 job testing against i386 Patrick Steinhardt
2025-01-07 12:30   ` [PATCH v3 08/10] ci: stop special-casing for Ubuntu 16.04 Patrick Steinhardt
2025-01-07 12:30   ` [PATCH v3 09/10] ci: use latest Ubuntu release Patrick Steinhardt
2025-01-07 12:30   ` [PATCH v3 10/10] ci: remove stale code for Azure Pipelines Patrick Steinhardt
2025-01-10 11:31 ` [PATCH v4 00/10] A couple of CI improvements Patrick Steinhardt
2025-01-10 11:31   ` [PATCH v4 01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix Patrick Steinhardt
2025-01-10 11:31   ` [PATCH v4 02/10] t7422: fix flaky test caused by buffered stdout Patrick Steinhardt
2025-01-24  9:16     ` Christian Couder
2025-01-10 11:31   ` [PATCH v4 03/10] github: adapt containerized jobs to be rootless Patrick Steinhardt
2025-01-24  9:56     ` Christian Couder
2025-08-28  9:58     ` Johannes Schindelin
2025-11-17 17:30       ` Johannes Schindelin
2025-01-10 11:32   ` [PATCH v4 04/10] github: convert all Linux jobs to be containerized Patrick Steinhardt
2025-01-10 11:32   ` [PATCH v4 05/10] github: simplify computation of the job's distro Patrick Steinhardt
2025-01-10 11:32   ` [PATCH v4 06/10] gitlab-ci: remove the "linux-old" job Patrick Steinhardt
2025-01-10 11:32   ` [PATCH v4 07/10] gitlab-ci: add linux32 job testing against i386 Patrick Steinhardt
2025-01-10 11:32   ` [PATCH v4 08/10] ci: stop special-casing for Ubuntu 16.04 Patrick Steinhardt
2025-01-10 11:32   ` [PATCH v4 09/10] ci: use latest Ubuntu release Patrick Steinhardt
2025-01-10 11:32   ` [PATCH v4 10/10] ci: remove stale code for Azure Pipelines Patrick Steinhardt
2025-01-10 12:03   ` [PATCH v4 00/10] A couple of CI improvements Jeff King
2025-01-24  9:59   ` Christian Couder
2025-01-27  7:00     ` Patrick Steinhardt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).