git.vger.kernel.org archive mirror
* [PATCH 0/4][Outreachy] Introduce os-version Capability with Configurable Options
@ 2025-01-06 10:30 Usman Akinyemi
  2025-01-06 10:30 ` [PATCH 1/4] version: refactor redact_non_printables() Usman Akinyemi
                   ` (4 more replies)
  0 siblings, 5 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-06 10:30 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood

For debugging, statistical analysis, and security purposes, it can
be valuable for Git servers to know the operating system the clients
are using.

For example:
- A server noticing that a client is using an old Git version with
security issues on one platform, like macOS, could verify if the
user is indeed running macOS before sending a message to upgrade.
- Similarly, a server identifying a client that could benefit from
an upgrade (e.g., for performance reasons) could better customize the
message it sends to nudge the client to upgrade.

So let's add a new 'os-version' capability to the v2 protocol, in the
same way as the existing 'agent' capability that lets clients and servers
exchange the Git version they are running.
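
As a rough illustration of what such a capability line looks like on the
wire, here is a minimal sketch that frames `os-version=<value>` as a
pkt-line (four hex digits giving the total length, including those four
digits, followed by the payload). The helper name sketch_pkt_line() is
hypothetical and is not part of this series; the real code goes through
Git's own packet_write_fmt().

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: write "key=value\n" into
 * 'out' as a single pkt-line. Returns 0 on success, -1 if the line is
 * too long for a pkt-line or does not fit into 'out'.
 */
static int sketch_pkt_line(const char *key, const char *value,
			   char *out, size_t outlen)
{
	/* payload is: key, '=', value, LF */
	size_t payload = strlen(key) + 1 + strlen(value) + 1;
	/* the 4-byte hex length prefix counts itself */
	size_t total = payload + 4;

	if (total > 65520 || total + 1 > outlen)
		return -1;
	snprintf(out, outlen, "%04zx%s=%s\n", total, key, value);
	return 0;
}
```

For example, the key "os-version" with the value "Linux" would be framed
as "0015os-version=Linux" followed by a newline.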

By default this sends information similar to what `git bugreport` already
collects using uname(2). The difference is that it is sanitized in the same
way as the Git version sent by the 'agent' capability (by trimming it and
replacing each character with an ASCII code of 32 or less, or 127 or more,
with '.'). Also, it only sends the result of `uname -s`, i.e. just the
operating system name (e.g. "Linux").
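
To make the sanitizing concrete, here is a minimal standalone sketch of
the same idea. It works on a plain NUL-terminated string instead of a
strbuf, and the name sketch_redact() is hypothetical; the actual
implementation is redact_non_printables() in patch 1/4.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustration only: trim leading and trailing whitespace in place,
 * then replace every remaining byte that is <= 32 or >= 127 with '.'.
 */
static void sketch_redact(char *s)
{
	size_t len = strlen(s);
	size_t start = 0;

	/* drop trailing, then leading, whitespace */
	while (len > 0 && isspace((unsigned char)s[len - 1]))
		len--;
	while (start < len && isspace((unsigned char)s[start]))
		start++;
	memmove(s, s + start, len - start);
	len -= start;
	s[len] = '\0';

	/* redact spaces, control characters, DEL and non-ASCII bytes */
	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)s[i];
		if (c <= 32 || c >= 127)
			s[i] = '.';
	}
}
```

With this sketch, a string such as "Linux" followed by a newline becomes
"Linux", and an inner space or tab becomes a '.'.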

To address privacy concerns, let's add the `transfer.advertiseOSVersion`
config option. This boolean option is enabled by default, but allows users to
disable this feature completely by setting it to "false".

To provide flexibility and customization, let's also add the `osversion.command`
config option. This allows users to specify a custom command whose output will
be used as the string exchanged via the "os-version" capability. If this option
is not set, the default behavior exchanges only the operating system name,
such as "Linux" or "Windows".
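
The fallback behavior can be sketched outside of Git as follows. This is
only an approximation under stated assumptions: it uses popen(3), which
runs the command through the shell, and keeps only the first line of
output, whereas patch 4/4 splits the command line itself and uses Git's
run-command API. The names sketch_capture() and sketch_os_version() are
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: run 'command' and copy the first line of its
 * output into 'out'. Returns 0 on success, -1 if the command could not
 * be run, exited with a non-zero status, or printed nothing.
 */
static int sketch_capture(const char *command, char *out, size_t outlen)
{
	char rest[256];
	FILE *fp;

	if (!command || outlen < 2)
		return -1;
	fp = popen(command, "r");
	if (!fp)
		return -1;
	if (!fgets(out, (int)outlen, fp))
		out[0] = '\0';
	/* drain any remaining output so the child can exit cleanly */
	while (fgets(rest, sizeof(rest), fp))
		;
	if (pclose(fp) != 0 || out[0] == '\0')
		return -1;
	out[strcspn(out, "\n")] = '\0';
	return 0;
}

/*
 * Use the configured command when it works; otherwise fall back to the
 * 'sysname' field from uname(2). 'command' may be NULL (option unset).
 */
static int sketch_os_version(const char *command, char *out, size_t outlen)
{
	struct utsname u;

	if (command && !sketch_capture(command, out, outlen))
		return 0;
	if (uname(&u))
		return -1;
	snprintf(out, outlen, "%s", u.sysname);
	return 0;
}
```

In this sketch a failing command silently falls back to the OS name,
which mirrors how os_version() in patch 4/4 calls get_uname_info() when
capturing the command output fails.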

Planned Feature: osversion.format
While the above configurations are already implemented, we will be introducing
an additional config option, `osversion.format`. This option would allow users
to fully customize the string sent to the other side using placeholders,
similar to how git for-each-ref uses %() syntax.

For example:
Format: "OS: %(os_name), Distro: %(distro), Arch: %(arch)"
Result: "OS: Linux, Distro: Fedora, Arch: x86_64"

We are wondering if it's worth it for placeholders to use the %()
syntax or if they could use another simpler syntax like $OS_NAME or
just OS_NAME instead of %(os_name).
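
To give an idea of what the %() variant could involve, here is a minimal
sketch of placeholder expansion. Everything in it is hypothetical, since
`osversion.format` is not implemented yet: the struct, the function
names, and the set of placeholders (os_name, distro, arch) are
assumptions taken from the example above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical set of values the placeholders could expand to. */
struct sketch_os_fields {
	const char *os_name;
	const char *distro;
	const char *arch;
};

/* Map a placeholder name of length 'len' to its value, or NULL. */
static const char *sketch_lookup(const struct sketch_os_fields *f,
				 const char *name, size_t len)
{
	if (len == 7 && !strncmp(name, "os_name", len))
		return f->os_name;
	if (len == 6 && !strncmp(name, "distro", len))
		return f->distro;
	if (len == 4 && !strncmp(name, "arch", len))
		return f->arch;
	return NULL;
}

/*
 * Expand %(name) placeholders from 'fmt' into 'out'. Unknown or
 * unterminated placeholders are copied through unchanged.
 * Returns 0 on success, -1 if 'out' is too small.
 */
static int sketch_expand(const char *fmt, const struct sketch_os_fields *f,
			 char *out, size_t outlen)
{
	size_t pos = 0;

	while (*fmt) {
		const char *end = NULL, *val = NULL;

		if (fmt[0] == '%' && fmt[1] == '(' &&
		    (end = strchr(fmt + 2, ')')))
			val = sketch_lookup(f, fmt + 2,
					    (size_t)(end - (fmt + 2)));
		if (val) {
			size_t n = strlen(val);

			if (pos + n >= outlen)
				return -1;
			memcpy(out + pos, val, n);
			pos += n;
			fmt = end + 1;
		} else {
			if (pos + 1 >= outlen)
				return -1;
			out[pos++] = *fmt++;
		}
	}
	if (pos >= outlen)
		return -1;
	out[pos] = '\0';
	return 0;
}
```

A simpler syntax such as $OS_NAME would mostly change how the end of a
placeholder name is detected; the lookup part would stay the same.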

Note that, due to differences between `uname(1)` (command-line
utility) and `uname(2)` (system call) outputs on Windows,
`transfer.advertiseOSVersion` is set to false on Windows during
testing. See the message part of patch 3/4 for more details.

My mentor, Christian Couder, previously sent a patch series about this.
You can find it here:
https://lore.kernel.org/git/20240619125708.3719150-1-christian.couder@gmail.com/

Usman Akinyemi (4):
  version: refactor redact_non_printables()
  version: refactor get_uname_info()
  connect: advertise OS version
  version: introduce osversion.command config for os-version output

 Documentation/config/transfer.txt |  16 ++++
 Documentation/gitprotocol-v2.txt  |  21 +++++
 builtin/bugreport.c               |  13 +--
 connect.c                         |   3 +
 serve.c                           |  14 +++
 t/t5555-http-smart-common.sh      |  41 ++++++++-
 t/t5701-git-serve.sh              |  45 +++++++++-
 t/test-lib-functions.sh           |   8 ++
 version.c                         | 136 ++++++++++++++++++++++++++++--
 version.h                         |  13 +++
 10 files changed, 291 insertions(+), 19 deletions(-)

-- 
2.47.1



* [PATCH 1/4] version: refactor redact_non_printables()
  2025-01-06 10:30 [PATCH 0/4][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
@ 2025-01-06 10:30 ` Usman Akinyemi
  2025-01-06 22:35   ` Eric Sunshine
  2025-01-06 10:30 ` [PATCH 2/4] version: refactor get_uname_info() Usman Akinyemi
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-06 10:30 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	Christian Couder

The git_user_agent_sanitized() function performs some sanitizing to
avoid special characters being sent over the line and possibly
interfering with the protocol or with the parsing on the other side.

Let's extract this sanitizing into a new redact_non_printables() function,
as we will want to reuse it in a following patch.

For now the new redact_non_printables() function is still static as
it's only needed locally.

While at it, let's also make a few small improvements:
  - use 'size_t' for 'i' instead of 'int',
  - move the declaration of 'i' inside the 'for ( ... )',
  - use strbuf_detach() to explicitly detach the string contained by
    the 'buf' strbuf.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 version.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/version.c b/version.c
index 4d763ab48d..78f025c808 100644
--- a/version.c
+++ b/version.c
@@ -6,6 +6,20 @@
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
 
+/*
+ * Trim and replace each character with ascii code below 32 or above
+ * 127 (included) using a dot '.' character.
+ * TODO: ensure consecutive non-printable characters are only replaced once
+*/
+static void redact_non_printables(struct strbuf *buf)
+{
+	strbuf_trim(buf);
+	for (size_t i = 0; i < buf->len; i++) {
+		if (buf->buf[i] <= 32 || buf->buf[i] >= 127)
+			buf->buf[i] = '.';
+	}
+}
+
 const char *git_user_agent(void)
 {
 	static const char *agent = NULL;
@@ -27,12 +41,8 @@ const char *git_user_agent_sanitized(void)
 		struct strbuf buf = STRBUF_INIT;
 
 		strbuf_addstr(&buf, git_user_agent());
-		strbuf_trim(&buf);
-		for (size_t i = 0; i < buf.len; i++) {
-			if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
-				buf.buf[i] = '.';
-		}
-		agent = buf.buf;
+		redact_non_printables(&buf);
+		agent = strbuf_detach(&buf, NULL);
 	}
 
 	return agent;
-- 
2.47.1



* [PATCH 2/4] version: refactor get_uname_info()
  2025-01-06 10:30 [PATCH 0/4][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
  2025-01-06 10:30 ` [PATCH 1/4] version: refactor redact_non_printables() Usman Akinyemi
@ 2025-01-06 10:30 ` Usman Akinyemi
  2025-01-06 16:04   ` Junio C Hamano
  2025-01-06 10:30 ` [PATCH 3/4] connect: advertise OS version Usman Akinyemi
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-06 10:30 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	Christian Couder

Some code from "builtin/bugreport.c" uses uname(2) to get system
information.

Let's refactor this code into a new get_uname_info() function, so
that we can reuse it in a following commit.

We may need to refactor this function in the future if an
`osVersion.format` config option is added, but for now we only
need it to accept a "full" flag that makes it switch between providing
full OS information and providing only the OS name. The mode
providing only the OS name is needed in a following commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 builtin/bugreport.c | 13 ++-----------
 version.c           | 23 +++++++++++++++++++++++
 version.h           |  7 +++++++
 3 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 7c2df035c9..e3288a86c8 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -12,10 +12,10 @@
 #include "diagnose.h"
 #include "object-file.h"
 #include "setup.h"
+#include "version.h"
 
 static void get_system_info(struct strbuf *sys_info)
 {
-	struct utsname uname_info;
 	char *shell = NULL;
 
 	/* get git version from native cmd */
@@ -24,16 +24,7 @@ static void get_system_info(struct strbuf *sys_info)
 
 	/* system call for other version info */
 	strbuf_addstr(sys_info, "uname: ");
-	if (uname(&uname_info))
-		strbuf_addf(sys_info, _("uname() failed with error '%s' (%d)\n"),
-			    strerror(errno),
-			    errno);
-	else
-		strbuf_addf(sys_info, "%s %s %s %s\n",
-			    uname_info.sysname,
-			    uname_info.release,
-			    uname_info.version,
-			    uname_info.machine);
+	get_uname_info(sys_info, 1);
 
 	strbuf_addstr(sys_info, _("compiler info: "));
 	get_compiler_info(sys_info);
diff --git a/version.c b/version.c
index 78f025c808..44ffc4dd57 100644
--- a/version.c
+++ b/version.c
@@ -2,6 +2,7 @@
 #include "version.h"
 #include "version-def.h"
 #include "strbuf.h"
+#include "gettext.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -47,3 +48,25 @@ const char *git_user_agent_sanitized(void)
 
 	return agent;
 }
+
+int get_uname_info(struct strbuf *buf, unsigned int full)
+{
+	struct utsname uname_info;
+
+	if (uname(&uname_info)) {
+		strbuf_addf(buf, _("uname() failed with error '%s' (%d)\n"),
+			    strerror(errno),
+			    errno);
+		return -1;
+	}
+
+	if (full)
+		strbuf_addf(buf, "%s %s %s %s\n",
+			    uname_info.sysname,
+			    uname_info.release,
+			    uname_info.version,
+			    uname_info.machine);
+	else
+		strbuf_addf(buf, "%s\n", uname_info.sysname);
+	return 0;
+}
diff --git a/version.h b/version.h
index 7c62e80577..5eb586c0bd 100644
--- a/version.h
+++ b/version.h
@@ -7,4 +7,11 @@ extern const char git_built_from_commit_string[];
 const char *git_user_agent(void);
 const char *git_user_agent_sanitized(void);
 
+/*
+  Try to get information about the system using uname(2).
+  Return -1 and put an error message into 'buf' in case of uname()
+  error. Return 0 and put uname info into 'buf' otherwise.
+*/
+int get_uname_info(struct strbuf *buf, unsigned int full);
+
 #endif /* VERSION_H */
-- 
2.47.1



* [PATCH 3/4] connect: advertise OS version
  2025-01-06 10:30 [PATCH 0/4][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
  2025-01-06 10:30 ` [PATCH 1/4] version: refactor redact_non_printables() Usman Akinyemi
  2025-01-06 10:30 ` [PATCH 2/4] version: refactor get_uname_info() Usman Akinyemi
@ 2025-01-06 10:30 ` Usman Akinyemi
  2025-01-06 16:22   ` Junio C Hamano
  2025-01-06 23:17   ` Eric Sunshine
  2025-01-06 10:30 ` [PATCH 4/4] version: introduce osversion.command config for os-version output Usman Akinyemi
  2025-01-17 10:46 ` [PATCH v2 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
  4 siblings, 2 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-06 10:30 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	Christian Couder

As some issues that can happen with a Git client can be operating system
specific, it can be useful for a server to know which OS a client is
using. In the same way it can be useful for a client to know which OS
a server is using.

Let's introduce a new protocol capability (`os-version`) allowing Git
clients and servers to exchange operating system information. The
capability is controlled by the new `transfer.advertiseOSVersion` config
option.

Add the `transfer.advertiseOSVersion` config option to address
privacy concerns. It defaults to `true` and can be changed to
`false`. When enabled, this option makes clients and servers send each
other the OS name (e.g., "Linux" or "Windows"). The information is
retrieved using the 'sysname' field of the `uname(2)` system call.

However, there are differences between `uname(1)` (command-line utility)
and `uname(2)` (system call) outputs on Windows. These discrepancies
complicate testing on Windows platforms. For example:
  - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\
  .2024-02-14.20:17.UTC.x86_64
  - `uname(2)` output: Windows.10.0.20348

Until a good way to test the feature on Windows is found,
`transfer.advertiseOSVersion` is set to false on Windows during testing.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 Documentation/config/transfer.txt |  7 ++++++
 Documentation/gitprotocol-v2.txt  | 20 +++++++++++++++
 connect.c                         |  3 +++
 serve.c                           | 14 +++++++++++
 t/t5555-http-smart-common.sh      | 12 ++++++++-
 t/t5701-git-serve.sh              | 12 ++++++++-
 t/test-lib-functions.sh           |  8 ++++++
 version.c                         | 42 +++++++++++++++++++++++++++++++
 version.h                         |  6 +++++
 9 files changed, 122 insertions(+), 2 deletions(-)

diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
index f1ce50f4a6..e2d95d1ccd 100644
--- a/Documentation/config/transfer.txt
+++ b/Documentation/config/transfer.txt
@@ -125,3 +125,10 @@ transfer.bundleURI::
 transfer.advertiseObjectInfo::
 	When `true`, the `object-info` capability is advertised by
 	servers. Defaults to false.
+
+transfer.advertiseOSVersion::
+	When `true`, the `os-version` capability is advertised by clients and
+	servers. It makes clients and servers send to each other a string
+	representing the operating system name, like "Linux" or "Windows".
+	This string is retrieved from the 'sysname' field of the struct returned
+	by the uname(2) system call. Defaults to true.
diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
index 1652fef3ae..c28262c60b 100644
--- a/Documentation/gitprotocol-v2.txt
+++ b/Documentation/gitprotocol-v2.txt
@@ -190,6 +190,26 @@ printable ASCII characters except space (i.e., the byte range 32 < x <
 and debugging purposes, and MUST NOT be used to programmatically assume
 the presence or absence of particular features.
 
+os-version
+~~~~~~~~~~
+
+In the same way as the `agent` capability above, the server can
+advertise the `os-version` capability with a value `X` (in the form
+`os-version=X`) to notify the client that the server is running an
+operating system that can be identified by `X`. The client may
+optionally send its own `os-version` string by including the
+`os-version` capability with a value `Y` (in the form `os-version=Y`)
+in its request to the server (but it MUST NOT do so if the server did
+not advertise the os-version capability). The `X` and `Y` strings may
+contain any printable ASCII characters except space (i.e., the byte
+range 32 < x < 127), and are typically made from the result of
+`uname -s`(OS name e.g Linux). The os-version capability can be disabled
+entirely by setting the `transfer.advertiseOSVersion` config option
+to `false`. The `os-version` strings are purely informative for
+statistics and debugging purposes, and MUST NOT be used to
+programmatically assume the presence or absence of particular
+features.
+
 ls-refs
 ~~~~~~~
 
diff --git a/connect.c b/connect.c
index 10fad43e98..6d5792b63c 100644
--- a/connect.c
+++ b/connect.c
@@ -492,6 +492,9 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
 	if (server_supports_v2("agent"))
 		packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());
 
+	if (server_supports_v2("os-version") && advertise_os_version(the_repository))
+		packet_write_fmt(fd_out, "os-version=%s", os_version_sanitized());
+
 	if (server_feature_v2("object-format", &hash_name)) {
 		int hash_algo = hash_algo_by_name(hash_name);
 		if (hash_algo == GIT_HASH_UNKNOWN)
diff --git a/serve.c b/serve.c
index c8694e3751..5b0d54ae9a 100644
--- a/serve.c
+++ b/serve.c
@@ -31,6 +31,16 @@ static int agent_advertise(struct repository *r UNUSED,
 	return 1;
 }
 
+static int os_version_advertise(struct repository *r,
+			   struct strbuf *value)
+{
+	if (!advertise_os_version(r))
+		return 0;
+	if (value)
+		strbuf_addstr(value, os_version_sanitized());
+	return 1;
+}
+
 static int object_format_advertise(struct repository *r,
 				   struct strbuf *value)
 {
@@ -123,6 +133,10 @@ static struct protocol_capability capabilities[] = {
 		.name = "agent",
 		.advertise = agent_advertise,
 	},
+	{
+		.name = "os-version",
+		.advertise = os_version_advertise,
+	},
 	{
 		.name = "ls-refs",
 		.advertise = ls_refs_advertise,
diff --git a/t/t5555-http-smart-common.sh b/t/t5555-http-smart-common.sh
index e47ea1ad10..f9e2a66cba 100755
--- a/t/t5555-http-smart-common.sh
+++ b/t/t5555-http-smart-common.sh
@@ -123,9 +123,19 @@ test_expect_success 'git receive-pack --advertise-refs: v1' '
 '
 
 test_expect_success 'git upload-pack --advertise-refs: v2' '
+	printf "agent=FAKE" >agent_and_os_name &&
+	if test_have_prereq WINDOWS
+	then
+		# We do not use test_config here so that any tests below can reuse
+		# the "expect" file from this test
+		git config transfer.advertiseOSVersion false
+	else
+		printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_os_name
+	fi &&
+
 	cat >expect <<-EOF &&
 	version 2
-	agent=FAKE
+	$(cat agent_and_os_name)
 	ls-refs=unborn
 	fetch=shallow wait-for-done
 	server-option
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index de904c1655..f4668b7acd 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -8,13 +8,23 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 . ./test-lib.sh
 
 test_expect_success 'test capability advertisement' '
+	printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_os_name &&
+	if test_have_prereq WINDOWS
+	then
+		# We do not use test_config here so that tests below will be able to reuse
+		# the expect.base and expect.trailer files
+		git config transfer.advertiseOSVersion false
+	else
+		printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_os_name
+	fi &&
+
 	test_oid_cache <<-EOF &&
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
 	cat >expect.base <<-EOF &&
 	version 2
-	agent=git/$(git version | cut -d" " -f3)
+	$(cat agent_and_os_name)
 	ls-refs=unborn
 	fetch=shallow wait-for-done
 	server-option
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78e054ab50..447c698d74 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -2007,3 +2007,11 @@ test_trailing_hash () {
 		test-tool hexdump |
 		sed "s/ //g"
 }
+
+# Trim and replace each character with ascii code below 32 or above
+# 127 (included) using a dot '.' character.
+# Octal intervals \001-\040 and \177-\377
+# corresponds to decimal intervals 1-32 and 127-255
+test_redact_non_printables () {
+    tr -d "\n" | tr "[\001-\040][\177-\377]" "."
+}
diff --git a/version.c b/version.c
index 44ffc4dd57..8242baf41c 100644
--- a/version.c
+++ b/version.c
@@ -3,6 +3,7 @@
 #include "version-def.h"
 #include "strbuf.h"
 #include "gettext.h"
+#include "config.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -70,3 +71,44 @@ int get_uname_info(struct strbuf *buf, unsigned int full)
 		strbuf_addf(buf, "%s\n", uname_info.sysname);
 	return 0;
 }
+
+const char *os_version(void)
+{
+	static const char *os = NULL;
+
+	if (!os) {
+		struct strbuf buf = STRBUF_INIT;
+
+		get_uname_info(&buf, 0);
+		os = strbuf_detach(&buf, NULL);
+	}
+
+	return os;
+}
+
+const char *os_version_sanitized(void)
+{
+	static const char *os_sanitized = NULL;
+
+	if (!os_sanitized) {
+		struct strbuf buf = STRBUF_INIT;
+
+		strbuf_addstr(&buf, os_version());
+		redact_non_printables(&buf);
+		os_sanitized = strbuf_detach(&buf, NULL);
+	}
+
+	return os_sanitized;
+}
+
+int advertise_os_version(struct repository *r)
+{
+	static int transfer_advertise_os_version = -1;
+
+	if (transfer_advertise_os_version == -1) {
+		repo_config_get_bool(r, "transfer.advertiseosversion", &transfer_advertise_os_version);
+		/* enabled by default */
+		transfer_advertise_os_version = !!transfer_advertise_os_version;
+	}
+	return transfer_advertise_os_version;
+}
diff --git a/version.h b/version.h
index 5eb586c0bd..8167ce956a 100644
--- a/version.h
+++ b/version.h
@@ -1,6 +1,8 @@
 #ifndef VERSION_H
 #define VERSION_H
 
+struct repository;
+
 extern const char git_version_string[];
 extern const char git_built_from_commit_string[];
 
@@ -14,4 +16,8 @@ const char *git_user_agent_sanitized(void);
 */
 int get_uname_info(struct strbuf *buf, unsigned int full);
 
+const char *os_version(void);
+const char *os_version_sanitized(void);
+int advertise_os_version(struct repository *r);
+
 #endif /* VERSION_H */
-- 
2.47.1



* [PATCH 4/4] version: introduce osversion.command config for os-version output
  2025-01-06 10:30 [PATCH 0/4][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
                   ` (2 preceding siblings ...)
  2025-01-06 10:30 ` [PATCH 3/4] connect: advertise OS version Usman Akinyemi
@ 2025-01-06 10:30 ` Usman Akinyemi
  2025-01-17 10:46 ` [PATCH v2 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
  4 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-06 10:30 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	Christian Couder

Currently, by default, the new `os-version` capability only exchanges the
operating system name between servers and clients, e.g. "Linux" or
"Windows".

Let's introduce a new configuration option, `osversion.command`, to handle
the string exchange between servers and clients. This option allows
customization of the exchanged string by leveraging the output of the
specified command. If this is not set, the `os-version` capability
exchanges just the operating system name.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 Documentation/config/transfer.txt | 11 ++++++-
 Documentation/gitprotocol-v2.txt  | 13 ++++----
 t/t5555-http-smart-common.sh      | 29 ++++++++++++++++++
 t/t5701-git-serve.sh              | 33 ++++++++++++++++++++
 version.c                         | 51 ++++++++++++++++++++++++++++++-
 5 files changed, 129 insertions(+), 8 deletions(-)

diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
index e2d95d1ccd..28a08f21fc 100644
--- a/Documentation/config/transfer.txt
+++ b/Documentation/config/transfer.txt
@@ -131,4 +131,13 @@ transfer.advertiseOSVersion::
 	servers. It makes clients and servers send to each other a string
 	representing the operating system name, like "Linux" or "Windows".
 	This string is retrieved from the 'sysname' field of the struct returned
-	by the uname(2) system call. Defaults to true.
+	by the uname(2) system call. If the `osVersion.command` is set, the
+	output of the command specified will be the string exchanged by the clients
+	and the servers. Defaults to true.
+
+osVersion.command::
+	If this variable is set, the specified command will be run and the output
+	will be used as the value `X` for `os-version` capability (in the form
+	`os-version=X`). `osVersion.command` is only used if `transfer.advertiseOSVersion`
+	is true. Refer to the linkgit:git-config[1] documentation to learn more about
+	`transfer.advertiseOSVersion` config option.
diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
index c28262c60b..53621c0bce 100644
--- a/Documentation/gitprotocol-v2.txt
+++ b/Documentation/gitprotocol-v2.txt
@@ -203,12 +203,13 @@ in its request to the server (but it MUST NOT do so if the server did
 not advertise the os-version capability). The `X` and `Y` strings may
 contain any printable ASCII characters except space (i.e., the byte
 range 32 < x < 127), and are typically made from the result of
-`uname -s`(OS name e.g Linux). The os-version capability can be disabled
-entirely by setting the `transfer.advertiseOSVersion` config option
-to `false`. The `os-version` strings are purely informative for
-statistics and debugging purposes, and MUST NOT be used to
-programmatically assume the presence or absence of particular
-features.
+`uname -s`(OS name e.g Linux).  If the `osVersion.command` is set,
+the `X` and `Y` are made from the output of the command specified.
+The os-version capability can be disabled entirely by setting the
+`transfer.advertiseOSVersion` config option to `false`. The `os-version`
+strings are purely informative for statistics and debugging purposes, and
+MUST NOT be used to programmatically assume the presence or absence of
+particular features.
 
 ls-refs
 ~~~~~~~
diff --git a/t/t5555-http-smart-common.sh b/t/t5555-http-smart-common.sh
index f9e2a66cba..8d5844eaf2 100755
--- a/t/t5555-http-smart-common.sh
+++ b/t/t5555-http-smart-common.sh
@@ -152,6 +152,35 @@ test_expect_success 'git upload-pack --advertise-refs: v2' '
 	test_cmp actual expect
 '
 
+test_expect_success 'git upload-pack --advertise-refs: v2 with osVersion.command config set' '
+	# test_config is used here as we are not reusing any file output from here
+	test_config osVersion.command "uname -srvm" &&
+	printf "agent=FAKE" >agent_and_long_os_name &&
+
+	if test_have_prereq !WINDOWS
+	then
+		printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_os_name
+	fi &&
+
+	cat >expect <<-EOF &&
+	version 2
+	$(cat agent_and_long_os_name)
+	ls-refs=unborn
+	fetch=shallow wait-for-done
+	server-option
+	object-format=$(test_oid algo)
+	0000
+	EOF
+
+	GIT_PROTOCOL=version=2 \
+	GIT_USER_AGENT=FAKE \
+	git upload-pack --advertise-refs . >out 2>err &&
+
+	test-tool pkt-line unpack <out >actual &&
+	test_must_be_empty err &&
+	test_cmp actual expect
+'
+
 test_expect_success 'git receive-pack --advertise-refs: v2' '
 	# There is no v2 yet for receive-pack, implicit v0
 	cat >expect <<-EOF &&
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index f4668b7acd..51d99cd62c 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -41,6 +41,39 @@ test_expect_success 'test capability advertisement' '
 	test_cmp expect actual
 '
 
+test_expect_success 'test capability advertisement with osVersion.command config set' '
+	# test_config is used here as we are not reusing any file output from here
+	test_config osVersion.command "uname -srvm" &&
+	printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_long_os_name &&
+
+	if test_have_prereq !WINDOWS
+	then
+		printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_os_name
+	fi &&
+
+	test_oid_cache <<-EOF &&
+	wrong_algo sha1:sha256
+	wrong_algo sha256:sha1
+	EOF
+	cat >expect.base_long <<-EOF &&
+	version 2
+	$(cat agent_and_long_os_name)
+	ls-refs=unborn
+	fetch=shallow wait-for-done
+	server-option
+	object-format=$(test_oid algo)
+	EOF
+	cat >expect.trailer_long <<-EOF &&
+	0000
+	EOF
+	cat expect.base_long expect.trailer_long >expect &&
+
+	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
+		--advertise-capabilities >out &&
+	test-tool pkt-line unpack <out >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'stateless-rpc flag does not list capabilities' '
 	# Empty request
 	test-tool pkt-line pack >in <<-EOF &&
diff --git a/version.c b/version.c
index 8242baf41c..b446232898 100644
--- a/version.c
+++ b/version.c
@@ -1,9 +1,13 @@
+#define USE_THE_REPOSITORY_VARIABLE
+
 #include "git-compat-util.h"
 #include "version.h"
 #include "version-def.h"
 #include "strbuf.h"
 #include "gettext.h"
 #include "config.h"
+#include "run-command.h"
+#include "alias.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -72,6 +76,50 @@ int get_uname_info(struct strbuf *buf, unsigned int full)
 	return 0;
 }
 
+/*
+ * Return -1 if unable to retrieve the osversion.command config or
+ * if the command is malformed; otherwise, return 0 if successful.
+ */
+static int fill_os_version_command(struct child_process *cmd)
+{
+	const char *os_version_command;
+	const char **argv;
+	char *os_version_copy;
+	int n;
+
+	if (git_config_get_string_tmp("osversion.command", &os_version_command))
+		return -1;
+
+	os_version_copy = xstrdup(os_version_command);
+	n = split_cmdline(os_version_copy, &argv);
+
+	if (n < 0) {
+		warning(_("malformed osVersion.command config option: %s"),
+			_(split_cmdline_strerror(n)));
+		free(os_version_copy);
+		return -1;
+	}
+
+	for (int i = 0; i < n; i++)
+		strvec_push(&cmd->args, argv[i]);
+	free(os_version_copy);
+	free(argv);
+
+	return 0;
+}
+
+static int capture_os_version(struct strbuf *buf)
+{
+	struct child_process cmd = CHILD_PROCESS_INIT;
+
+	if (fill_os_version_command(&cmd))
+		return -1;
+	if (capture_command(&cmd, buf, 0))
+		return -1;
+
+	return 0;
+}
+
 const char *os_version(void)
 {
 	static const char *os = NULL;
@@ -79,7 +127,8 @@ const char *os_version(void)
 	if (!os) {
 		struct strbuf buf = STRBUF_INIT;
 
-		get_uname_info(&buf, 0);
+		if (capture_os_version(&buf))
+			get_uname_info(&buf, 0);
 		os = strbuf_detach(&buf, NULL);
 	}
 
-- 
2.47.1



* Re: [PATCH 2/4] version: refactor get_uname_info()
  2025-01-06 10:30 ` [PATCH 2/4] version: refactor get_uname_info() Usman Akinyemi
@ 2025-01-06 16:04   ` Junio C Hamano
  2025-01-08 13:06     ` Usman Akinyemi
  0 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-01-06 16:04 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> Some code from "builtin/bugreport.c" uses uname(2) to get system
> information.
>
> Let's refactor this code into a new get_uname_info() function, so
> that we can reuse it in a following commit.

This does two things: refactor and enhancement.  Shouldn't it do
pure refactoring in a single patch, with a follow-up patch that
extends it to allow the caller to hide the system details?



* Re: [PATCH 3/4] connect: advertise OS version
  2025-01-06 10:30 ` [PATCH 3/4] connect: advertise OS version Usman Akinyemi
@ 2025-01-06 16:22   ` Junio C Hamano
  2025-01-08 13:06     ` Usman Akinyemi
  2025-01-06 23:17   ` Eric Sunshine
  1 sibling, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-01-06 16:22 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> +
> +transfer.advertiseOSVersion::
> +	When `true`, the `os-version` capability is advertised by clients and
> +	servers. It makes clients and servers send to each other a string
> +	representing the operating system name, like "Linux" or "Windows".
> +	This string is retrieved from the 'sysname' field of the struct returned
> +	by the uname(2) system call. Defaults to true.

Shouldn't `sysname` be typeset as a literal, just like `true` and
`os-version`?

> +os-version
> +~~~~~~~~~~
> +
> +In the same way as the `agent` capability above, the server can
> +advertise the `os-version` capability with a value `X` (in the form
> +`os-version=X`) to notify the client that the server is running an
> +operating system that can be identified by `X`. The client may

Hmph.  I am not sure what's the value of mentioning 'X' here.  To me

    ... can advertise the `os-version` capability to notify the kind
    of operating system it is running on.

conveys the same thing with much fewer bytes.

> +optionally send its own `os-version` string by including the
> +`os-version` capability with a value `Y` (in the form `os-version=Y`)
> +in its request to the server (but it MUST NOT do so if the server did
> +not advertise the os-version capability). The `X` and `Y` strings may
> +contain any printable ASCII characters except space (i.e., the byte

This is misleading.  ASCII printable characters range from 33 to 126
(inclusive), but by saying "except space", the readers are led to
believe that the author of this documentation thinks ASCII 32 is
printable, too.

About 'X' and 'Y', we can just say "the value of this capability may
consist of ASCII printable characters (from 33 to 126 inclusive)" or
something.
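
To make the suggested range concrete, here is a minimal sketch of a check
that a capability value consists only of bytes 33 to 126 inclusive. The
function name is hypothetical and is not part of Git; rejecting an empty
value is also an assumption made for this illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of Git): return 1 if every byte of
 * `value` lies in the range 33..126 inclusive, so no space, no control
 * characters, no DEL and no 8-bit bytes; return 0 otherwise.
 * An empty or NULL value is rejected (an assumption of this sketch).
 */
int is_valid_capability_value(const char *value)
{
	size_t i;

	if (!value || !*value)
		return 0;
	for (i = 0; value[i]; i++) {
		unsigned char c = (unsigned char)value[i];

		if (c < 33 || c > 126)
			return 0;
	}
	return 1;
}
```

With this check, "Linux" and "Windows.10.0.20348" pass, while "Mac OS"
fails because of the embedded space.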

Is there a need for a registry of canonical os-version strings?  One
reason why you would want this user-settable (as opposed to being
derived from "uname -s") is that a system that is presumably the
same in end-user perception can call itself by different names (your
Windows/MINGW64 example); by having the users set it to a string
chosen from a small repertoire, the other end would be able to
identify them more easily.  I do not think it is necessarily a
good idea to limit what value the users can set to this
configuration variable, but at least with a published guideline on
calling various types of systems (and an explanation on the reason
why we publish such a guideline), users would make an informed
decision when picking what string to send.

> +# Trim and replace each character with ascii code below 32 or above
> +# 127 (included) using a dot '.' character.
> +# Octal intervals \001-\040 and \177-\377
> +# corresponds to decimal intervals 1-32 and 127-255
> +test_redact_non_printables () {
> +    tr -d "\n" | tr "[\001-\040][\177-\377]" "."
> +}

Just being curious.  Do we need to worry about carriage-returns not
just line-feeds, and if not why?

Thanks.


* Re: [PATCH 1/4] version: refactor redact_non_printables()
  2025-01-06 10:30 ` [PATCH 1/4] version: refactor redact_non_printables() Usman Akinyemi
@ 2025-01-06 22:35   ` Eric Sunshine
  2025-01-08 12:58     ` Usman Akinyemi
  0 siblings, 1 reply; 108+ messages in thread
From: Eric Sunshine @ 2025-01-06 22:35 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, gitster, ps, johncai86,
	Johannes.Schindelin, me, phillip.wood, Christian Couder

On Mon, Jan 6, 2025 at 5:37 AM Usman Akinyemi
<usmanakinyemi202@gmail.com> wrote:
> The git_user_agent_sanitized() function performs some sanitizing to
> avoid special characters being sent over the line and possibly messing
> up with the protocol or with the parsing on the other side.
>
> Let's extract this sanitizing into a new redact_non_printables() function,
> as we will want to reuse it in a following patch.
>
> For now the new redact_non_printables() function is still static as
> it's only needed locally.
>
> While at it, let's also make a few small improvements:
>   - use 'size_t' for 'i' instead of 'int',
>   - move the declaration of 'i' inside the 'for ( ... )',

Regarding the above two items...

>   - use strbuf_detach() to explicitly detach the string contained by
>     the 'buf' strbuf.
>
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> ---
> diff --git a/version.c b/version.c
> @@ -6,6 +6,20 @@
> +static void redact_non_printables(struct strbuf *buf)
> +{
> +       strbuf_trim(buf);
> +       for (size_t i = 0; i < buf->len; i++) {
> +               if (buf->buf[i] <= 32 || buf->buf[i] >= 127)
> +                       buf->buf[i] = '.';
> +       }
> +}
> @@ -27,12 +41,8 @@ const char *git_user_agent_sanitized(void)
>                 strbuf_addstr(&buf, git_user_agent());
> -               strbuf_trim(&buf);
> -               for (size_t i = 0; i < buf.len; i++) {

... the original code appears to have already been using `size_t` and
declaring the loop variable inside the `for` statement, despite what
the commit message says. So, is the commit message out of date? Or are
the patches out of order? Or something else?

> -                       if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
> -                               buf.buf[i] = '.';
> -               }
> -               agent = buf.buf;
> +               redact_non_printables(&buf);
> +               agent = strbuf_detach(&buf, NULL);


* Re: [PATCH 3/4] connect: advertise OS version
  2025-01-06 10:30 ` [PATCH 3/4] connect: advertise OS version Usman Akinyemi
  2025-01-06 16:22   ` Junio C Hamano
@ 2025-01-06 23:17   ` Eric Sunshine
  2025-01-08 13:14     ` Usman Akinyemi
  1 sibling, 1 reply; 108+ messages in thread
From: Eric Sunshine @ 2025-01-06 23:17 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, gitster, ps, johncai86,
	Johannes.Schindelin, me, phillip.wood, Christian Couder

On Mon, Jan 6, 2025 at 5:37 AM Usman Akinyemi
<usmanakinyemi202@gmail.com> wrote:
> As some issues that can happen with a Git client can be operating system
> specific, it can be useful for a server to know which OS a client is
> using. In the same way it can be useful for a client to know which OS
> a server is using.
>
> Let's introduce a new protocol (`os-version`) allowing Git clients and
> servers to exchange operating system information. The protocol is
> controlled by the new `transfer.advertiseOSVersion` config option.
>
> Add the `transfer.advertiseOSVersion` config option to address
> privacy concerns issue. It defaults to `true` and can be changed to
> `false`. When enabled, this option makes clients and servers send each
> other the OS name (e.g., "Linux" or "Windows"). The information is
> retrieved using the 'sysname' field of the `uname(2)` system call.
>
> However, there are differences between `uname(1)` (command-line utility)
> and `uname(2)` (system call) outputs on Windows. These discrepancies
> complicate testing on Windows platforms. For example:
>   - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\
>   .2024-02-14.20:17.UTC.x86_64
>   - `uname(2)` output: Windows.10.0.20348
>
> Until a good way to test the feature on Windows is found, the
> transfer.advertiseOSVersion is set to false on Windows during testing.

This is because the uname(2) you mention above is not actually
system-supplied but is instead faked up by Git itself for the Git for
Windows port. See git/compat/mingw.c:uname().

The typical way to work around this sort of issue is to ensure that
you check Git against Git itself instead of checking Git against
"system". To do so, you would implement a new "test-tool" command, say
`test-tool uname`, in git/t/helper/test-uname.c which internally
calls the same uname() function that other parts of Git call. Doing so
ensures consistency of output.

Whether or not it makes sense to go through that extra work for this
particular case is a different question.
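
As a rough illustration of the idea, the core of such a helper only needs
to report the same `sysname` that the code under test sees. The sketch
below is a standalone POSIX stand-in, not an actual Git helper:
get_sysname() is a made-up name, and it calls the system's uname(2)
directly, whereas a real helper built inside Git would pick up Git's own
uname() on Windows.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical stand-in for the body of a test helper: copy the
 * `sysname` field reported by uname(2) into `out`, truncating if
 * needed.  Returns 0 on success and -1 on failure.
 */
int get_sysname(char *out, size_t outlen)
{
	struct utsname u;

	if (!out || !outlen || uname(&u) < 0)
		return -1;
	snprintf(out, outlen, "%s", u.sysname);
	return 0;
}
```

A test could then compare the advertised value against the output of the
helper rather than against `uname -s`, so both sides go through the same
function.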

> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> ---
> diff --git a/t/t5555-http-smart-common.sh b/t/t5555-http-smart-common.sh
> @@ -123,9 +123,19 @@ test_expect_success 'git receive-pack --advertise-refs: v1' '
>  test_expect_success 'git upload-pack --advertise-refs: v2' '
> +       printf "agent=FAKE" >agent_and_os_name &&
> +       if test_have_prereq WINDOWS
> +       then
> +               # We do not use test_config here so that any tests below can reuse
> +               # the "expect" file from this test
> +               git config transfer.advertiseOSVersion false

Should this have a comment explaining why you're disabling
transfer.advertiseOSVersion, in particular that you found uname() on
Windows unreliable, thus need to disable the check for this case?

The comment you did compose exposes a fragility of the tests: in
particular that subsequent tests rely upon a side-effect of this test.
The fact that you had to include a special comment explaining the
problem argues for a cleaner solution, such as splitting out part of
this code into a separate test which comes before this one:
specifically, a "setup"-type test which creates the "expect" file
which gets reused by multiple tests.

> +       else
> +               printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_os_name
> +       fi &&
> diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
> @@ -8,13 +8,23 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>  test_expect_success 'test capability advertisement' '
> +       printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_os_name &&
> +       if test_have_prereq WINDOWS
> +       then
> +               # We do not use test_config here so that tests below will be able to reuse
> +               # the expect.base and expect.trailer files
> +               git config transfer.advertiseOSVersion false

Ditto.

> +       else
> +               printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_os_name
> +       fi &&


* Re: [PATCH 1/4] version: refactor redact_non_printables()
  2025-01-06 22:35   ` Eric Sunshine
@ 2025-01-08 12:58     ` Usman Akinyemi
  0 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-08 12:58 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: git, christian.couder, gitster, ps, johncai86,
	Johannes.Schindelin, me, phillip.wood, Christian Couder

On Tue, Jan 7, 2025 at 4:05 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Mon, Jan 6, 2025 at 5:37 AM Usman Akinyemi
> <usmanakinyemi202@gmail.com> wrote:
> > The git_user_agent_sanitized() function performs some sanitizing to
> > avoid special characters being sent over the line and possibly messing
> > up with the protocol or with the parsing on the other side.
> >
> > Let's extract this sanitizing into a new redact_non_printables() function,
> > as we will want to reuse it in a following patch.
> >
> > For now the new redact_non_printables() function is still static as
> > it's only needed locally.
> >
> > While at it, let's also make a few small improvements:
> >   - use 'size_t' for 'i' instead of 'int',
> >   - move the declaration of 'i' inside the 'for ( ... )',
>
> Regarding the above two items...
>
> >   - use strbuf_detach() to explicitly detach the string contained by
> >     the 'buf' strbuf.
> >
> > Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> > Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> > ---
> > diff --git a/version.c b/version.c
> > @@ -6,6 +6,20 @@
> > +static void redact_non_printables(struct strbuf *buf)
> > +{
> > +       strbuf_trim(buf);
> > +       for (size_t i = 0; i < buf->len; i++) {
> > +               if (buf->buf[i] <= 32 || buf->buf[i] >= 127)
> > +                       buf->buf[i] = '.';
> > +       }
> > +}
> > @@ -27,12 +41,8 @@ const char *git_user_agent_sanitized(void)
> >                 strbuf_addstr(&buf, git_user_agent());
> > -               strbuf_trim(&buf);
> > -               for (size_t i = 0; i < buf.len; i++) {
>
> ... the original code appears to have already been using `size_t` and
> declaring the loop variable inside the `for` statement, despite what
> the commit message says. So, is the commit message out of date? Or are
> the patches out of order? Or something else?
I just investigated what happened. Another commit had already made these
changes, and I rebased on top of "master" without noticing.
The commit message is out of date.

I will update it in the next iteration.
Thank you very much.
Usman.
>
> > -                       if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
> > -                               buf.buf[i] = '.';
> > -               }
> > -               agent = buf.buf;
> > +               redact_non_printables(&buf);
> > +               agent = strbuf_detach(&buf, NULL);


* Re: [PATCH 3/4] connect: advertise OS version
  2025-01-06 16:22   ` Junio C Hamano
@ 2025-01-08 13:06     ` Usman Akinyemi
  2025-01-08 16:15       ` Junio C Hamano
  0 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-08 13:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, Christian Couder

On Mon, Jan 6, 2025 at 9:52 PM Junio C Hamano <gitster@pobox.com> wrote:
>
Hi Junio,
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> > +
> > +transfer.advertiseOSVersion::
> > +     When `true`, the `os-version` capability is advertised by clients and
> > +     servers. It makes clients and servers send to each other a string
> > +     representing the operating system name, like "Linux" or "Windows".
> > +     This string is retrieved from the 'sysname' field of the struct returned
> > +     by the uname(2) system call. Defaults to true.
>
> Shouldn't `sysname` be typeset as a literal, just like `true` and
> `os-version`?
I will do that in the next iteration. Thank you.
>
> > +os-version
> > +~~~~~~~~~~
> > +
> > +In the same way as the `agent` capability above, the server can
> > +advertise the `os-version` capability with a value `X` (in the form
> > +`os-version=X`) to notify the client that the server is running an
> > +operating system that can be identified by `X`. The client may
>
> Hmph.  I am not sure what's the value of mentioning 'X' here.  To me
>
>     ... can advertise the `os-version` capability to notify the kind
>     of operating system it is running on.
>
> conveys the same thing with much fewer bytes.
Yeah, it is better, I will use it in the next iteration.
>
> > +optionally send its own `os-version` string by including the
> > +`os-version` capability with a value `Y` (in the form `os-version=Y`)
> > +in its request to the server (but it MUST NOT do so if the server did
> > +not advertise the os-version capability). The `X` and `Y` strings may
> > +contain any printable ASCII characters except space (i.e., the byte
>
> This is misleading.  ASCII printable characters range from 33 to 126
> (inclusive), but by saying "except space", the readers are led to
> believe that the author of this documentation thinks ASCII 32 is
> printable, too.
Thanks for this, I will make changes in the next iteration.
>
> About 'X' and 'Y', we can just say "the value of this capability may
> consist of ASCII printable characters (from 33 to 126 inclusive)" or
> something.
>
Noted. Thank you.
> Is there a need for a registry of canonical os-version strings?  One
> reason why you would want this user-settable (as opposed to being
> derived from "uname -s") is that a system that is presumably the
> same in end-user perception can call itself in different names (your
> Windows/MINGW64 example) and having the users set it to a string
> chosen from a small repertoire, the other end would be able to
> identify them more easily.  I do not think it is necessarily a
> good idea to limit what value the users can set to this
> configuration variable, but at least with a published guideline on
> calling various types of systems (and an explanation on the reason
> why we publish such a guideline), users would make an informed
> decision when picking what string to send.
We plan to implement another config option, `osVersion.format`, which
would allow users to fully customize the string sent to the other side
using placeholders, similar to how `git for-each-ref` uses the %() syntax.
Users would be able to set it to a fixed string, e.g. "Linux" or
"Windows" (without any placeholder), and it would be sent as-is. So
`osVersion.format` should satisfy this need. I will make sure to document
that the option can be used this way, and will give a small list of
`os-version` strings that can be used.
>
> > +# Trim and replace each character with ascii code below 32 or above
> > +# 127 (included) using a dot '.' character.
> > +# Octal intervals \001-\040 and \177-\377
> > +# corresponds to decimal intervals 1-32 and 127-255
> > +test_redact_non_printables () {
> > +    tr -d "\n" | tr "[\001-\040][\177-\377]" "."
> > +}
>
> Just being curious.  Do we need to worry about carriage-returns not
> just line-feeds, and if not why?
The `tr "[\001-\040][\177-\377]" "."` call already replaces carriage
returns with ".", and redact_non_printables() will also replace them
with ".". A carriage return has octal code 015 (decimal 13), which falls
inside the replaced range. So, we do not need to worry about it.
>
> Thanks.
Thank you.
Usman.


* Re: [PATCH 2/4] version: refactor get_uname_info()
  2025-01-06 16:04   ` Junio C Hamano
@ 2025-01-08 13:06     ` Usman Akinyemi
  0 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-08 13:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, Christian Couder

Hi Junio

On Mon, Jan 6, 2025 at 9:34 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> > Some code from "builtin/bugreport.c" uses uname(2) to get system
> > information.
> >
> > Let's refactor this code into a new get_uname_info() function, so
> > that we can reuse it in a following commit.
>
> This does two things: refactor and enhancement.  Shouldn't it do
> pure refactoring in a single patch, with a follow-up patch that
> extends it to allow the caller to hide the system details?
>
Thanks for this, I will split the commit into two patches in the next iteration.


* Re: [PATCH 3/4] connect: advertise OS version
  2025-01-06 23:17   ` Eric Sunshine
@ 2025-01-08 13:14     ` Usman Akinyemi
  0 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-08 13:14 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: git, christian.couder, gitster, ps, johncai86,
	Johannes.Schindelin, me, phillip.wood, Christian Couder

On Tue, Jan 7, 2025 at 4:47 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Mon, Jan 6, 2025 at 5:37 AM Usman Akinyemi
> <usmanakinyemi202@gmail.com> wrote:
> > As some issues that can happen with a Git client can be operating system
> > specific, it can be useful for a server to know which OS a client is
> > using. In the same way it can be useful for a client to know which OS
> > a server is using.
> >
> > Let's introduce a new protocol (`os-version`) allowing Git clients and
> > servers to exchange operating system information. The protocol is
> > controlled by the new `transfer.advertiseOSVersion` config option.
> >
> > Add the `transfer.advertiseOSVersion` config option to address
> > privacy concerns issue. It defaults to `true` and can be changed to
> > `false`. When enabled, this option makes clients and servers send each
> > other the OS name (e.g., "Linux" or "Windows"). The information is
> > retrieved using the 'sysname' field of the `uname(2)` system call.
> >
> > However, there are differences between `uname(1)` (command-line utility)
> > and `uname(2)` (system call) outputs on Windows. These discrepancies
> > complicate testing on Windows platforms. For example:
> >   - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\
> >   .2024-02-14.20:17.UTC.x86_64
> >   - `uname(2)` output: Windows.10.0.20348
> >
> > Until a good way to test the feature on Windows is found, the
> > transfer.advertiseOSVersion is set to false on Windows during testing.
>
> This is because the uname(2) you mention above is not actually
> system-supplied but is instead faked up by Git itself for the Git for
> Windows port. See git/compat/mingw.c:uname().
>
> The typical way to work around this sort of issue is to ensure that
> you check Git against Git itself instead of checking Git against
> "system". To do so, you would implement a new "test-tool" command, say
> `test-tool uname`, in git/t/helper/test-uname.c which internally
> calls the same uname() function that other parts of Git call. Doing so
> ensures consistency of output.
>
> Whether or not it makes sense to go through that extra work for this
> particular case is a different question.
Hi Eric,

Thank you for the explanation. I will look into it.
>
> > Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> > Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> > ---
> > diff --git a/t/t5555-http-smart-common.sh b/t/t5555-http-smart-common.sh
> > @@ -123,9 +123,19 @@ test_expect_success 'git receive-pack --advertise-refs: v1' '
> >  test_expect_success 'git upload-pack --advertise-refs: v2' '
> > +       printf "agent=FAKE" >agent_and_os_name &&
> > +       if test_have_prereq WINDOWS
> > +       then
> > +               # We do not use test_config here so that any tests below can reuse
> > +               # the "expect" file from this test
> > +               git config transfer.advertiseOSVersion false
>
> Should this have a comment explaining why you're disabling
> transfer.advertiseOSVersion, in particular that you found uname() on
> Windows unreliable, thus need to disable the check for this case?
>
> The comment you did compose exposes a fragility of the tests: in
> particular that subsequent tests rely upon a side-effect of this test.
> The fact that you had to include a special comment explaining the
> problem argues for a cleaner solution, such as splitting out part of
> this code into a separate test which comes before this one:
> specifically, a "setup"-type test which creates the "expect" file
> which gets reused by multiple tests.
I will work on it and update it in the next iteration.
Thank you very much.
Usman.
>
> > +       else
> > +               printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_os_name
> > +       fi &&
> > diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
> > @@ -8,13 +8,23 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> >  test_expect_success 'test capability advertisement' '
> > +       printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_os_name &&
> > +       if test_have_prereq WINDOWS
> > +       then
> > +               # We do not use test_config here so that tests below will be able to reuse
> > +               # the expect.base and expect.trailer files
> > +               git config transfer.advertiseOSVersion false
>
> Ditto.
>
> > +       else
> > +               printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_os_name
> > +       fi &&


* Re: [PATCH 3/4] connect: advertise OS version
  2025-01-08 13:06     ` Usman Akinyemi
@ 2025-01-08 16:15       ` Junio C Hamano
  2025-01-09 14:25         ` Usman Akinyemi
  0 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-01-08 16:15 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

>> Is there a need for a registry of canonical os-version strings?  One
>> reason why you would want this user-settable (as opposed to being
>> derived from "uname -s") is that a system that is presumably the
>> same in end-user perception can call itself in different names (your
>> Windows/MINGW64 example) and having the users set it to a string
>> chosen from a small repertoire, the other end would be able to
>> identify them more easily.  I do not think it is necessarily a
>> good idea to limit what value the users can set to this
>> configuration variable, but at least with a published guideline on
>> calling various types of systems (and an explanation on the reason
>> why we publish such a guideline), users would make an informed
>> decision when picking what string to send.
>
> We plan to implement another config option `osVersion.format`, which
> allow users to fully customize the string sent to the other side using
> placeholders,

Sorry, you lost me.

I was wondering if we want to (informally at first) make it _less_
flexible, so that we can prevent people from being "creative" when
the value of being creative is negative.  Adding even more ways to
customize the string to subject the receiving/inspecting end to more
unnecessary variations to call the same thing in different names is
the last thing we want to see in that context, isn't it?

If you have "any random string goes" configuration mechanism, it is
pretty much game over.  You do not need to add an elaborate .format
mechanism to let users throw random garbage at the other side of the
connection.

>> > +# Trim and replace each character with ascii code below 32 or above
>> > +# 127 (included) using a dot '.' character.
>> > +# Octal intervals \001-\040 and \177-\377
>> > +# corresponds to decimal intervals 1-32 and 127-255
>> > +test_redact_non_printables () {
>> > +    tr -d "\n" | tr "[\001-\040][\177-\377]" "."
>> > +}
>>
>> Just being curious.  Do we need to worry about carriage-returns not
>> just line-feeds, and if not why?
> The function `tr "[\001-\040][\177-\377]" "."` already replace the
> carriage-returns with "."

That is exactly my point.  LFs are stripped; I do not see a sensible
reason why CR shouldn't be removed the same way.
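
To illustrate the behavior being asked for, here is a minimal sketch in C
that drops both CR and LF and then replaces the remaining bytes with code
32 or below, or 127 or above, with a dot. It is not Git's
redact_non_printables() nor the shell helper from the patch; the function
name is made up for this example.

```c
#include <stddef.h>

/*
 * Illustrative only; not Git's redact_non_printables().  Works in
 * place on a NUL-terminated string: every CR and LF byte is dropped,
 * and each remaining byte with code <= 32 or >= 127 is replaced by
 * '.'.  Returns the new length of the string.
 */
size_t strip_eol_and_redact(char *s)
{
	size_t in, out = 0;

	for (in = 0; s[in]; in++) {
		unsigned char c = (unsigned char)s[in];

		if (c == '\r' || c == '\n')
			continue;	/* CR is dropped the same way as LF */
		s[out++] = (c <= 32 || c >= 127) ? '.' : (char)c;
	}
	s[out] = '\0';
	return out;
}
```

With this, "Linux\r\n" becomes "Linux" instead of "Linux.", which is the
difference between stripping CR and merely redacting it.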

Thanks.


* Re: [PATCH 3/4] connect: advertise OS version
  2025-01-08 16:15       ` Junio C Hamano
@ 2025-01-09 14:25         ` Usman Akinyemi
  2025-01-09 15:46           ` Junio C Hamano
  0 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-09 14:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, Christian Couder

On Wed, Jan 8, 2025 at 9:45 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> >> Is there a need for a registry of canonical os-version strings?  One
> >> reason why you would want this user-settable (as opposed to being
> >> derived from "uname -s") is that a system that is presumably the
> >> same in end-user perception can call itself in different names (your
> >> Windows/MINGW64 example) and having the users set it to a string
> >> chosen from a small repertoire, the other end would be able to
> >> identify them more easily.  I do not think it is necessarily a
> >> good idea to limit what value the users can set to this
> >> configuration variable, but at least with a published guideline on
> >> calling various types of systems (and an explanation on the reason
> >> why we publish such a guideline), users would make an informed
> >> decision when picking what string to send.
> >
> > We plan to implement another config option `osVersion.format`, which
> > allow users to fully customize the string sent to the other side using
> > placeholders,
>
> Sorry, you lost me.
>
> I was wondering if we want to (informally at first) make it _less_
> flexible, so that we can prevent people from being "creative" when
> the value of being creative is negative.  Adding even more ways to
> customize the string to subject the receiving/inspecting end to more
> unnecessary variations to call the same thing in different names is
> the last thing we want to see in that context, isn't it?
>
> If you have "any random string goes" configuration mechanism, it is
> pretty much game over.  You do not need to add an elaborate .format
> mechanism to let users throw random garbage at the other side of the
> connection.
Thanks for the explanation.
Instead of having a .format option that would allow users to have multiple
variations or different placeholders, we can allow it to take only
specific values, for example:
- "full", which would mean the same thing as the result of `uname -srvm`,
- "default" or "short", which would mean the same thing as the result
of `uname -s`,
- "medium", which would mean the same thing as the result of `uname -sr`.

What do you think about this?

Thank you.
>
> >> > +# Trim and replace each character with ascii code below 32 or above
> >> > +# 127 (included) using a dot '.' character.
> >> > +# Octal intervals \001-\040 and \177-\377
> >> > +# corresponds to decimal intervals 1-32 and 127-255
> >> > +test_redact_non_printables () {
> >> > +    tr -d "\n" | tr "[\001-\040][\177-\377]" "."
> >> > +}
> >>
> >> Just being curious.  Do we need to worry about carriage-returns not
> >> just line-feeds, and if not why?
> > The function `tr "[\001-\040][\177-\377]" "."` already replace the
> > carriage-returns with "."
>
> That is exactly my point.  LFs are stripped; I do not see a sensible
> reason why CR shouldn't be removed the same way.
Yeah, I will add that in the next iteration.

Thank you.
Usman.
>
> Thanks.


* Re: [PATCH 3/4] connect: advertise OS version
  2025-01-09 14:25         ` Usman Akinyemi
@ 2025-01-09 15:46           ` Junio C Hamano
  2025-01-10 17:56             ` Usman Akinyemi
  0 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-01-09 15:46 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> Instead of having .format that will allow user to have multiple
> variation or different placeholder,
> we can allow it to take only specific values for examples:
> - "full" which would mean the same thing as  the result of `uname -srvm`,
> - "default" or "short" which would mean the same thing as  the result
> of `uname -s`,
> - "medium"  which would mean the same thing as  the result of `uname -sr`.
>
> What is your thought about this ?

I think two-level is good enough.  One level is "yes, please give
the minimum that would not offend even the privacy-conscious folks
(like 'Linux', 'macOS', 'Windows' etc.)" or "no, please do not show
os-version at all".  The other is "Please use this exact string."
We do not need anything more elaborate.
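
A minimal sketch of that two-level choice, with made-up names: how the
on/off switch and the exact string interact (here, "off" wins over the
exact string, and an empty exact string falls back to the default) is an
assumption of this illustration, not something settled in this thread.

```c
#include <stddef.h>

/*
 * Hypothetical decision helper; all names are made up.
 *   advertise - the yes/no switch for showing os-version at all
 *   exact     - user-supplied "use this exact string", or NULL
 *   sysname   - the minimal default, e.g. "Linux"
 * Returns the string to advertise, or NULL to advertise nothing.
 */
const char *choose_os_version(int advertise, const char *exact,
			      const char *sysname)
{
	if (!advertise)
		return NULL;
	if (exact && *exact)
		return exact;
	return sysname;
}
```

Nothing beyond these three outcomes (nothing, the minimal default, or the
exact string) is needed under this scheme.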

The reasoning behind this conclusion goes like this.

First of all, I mentioned "registry of canonical os-version strings"
to help the users of the "Please use this string" level, so that their
servers do not have to suffer from different names and spellings to
identify the same class of clients.

But the server operators that *want* such tighter control *and* are
capable of enforcing their choice to their users are probably $CORP
in-house operators.  They can tell their employees what string to
use, or they may even do that in /etc/gitconfig on the machines they
give to their users.  In other words, they do not need our help at
all.

At least that is my thought.  Others may have different opinions.

Thanks.


* Re: [PATCH 3/4] connect: advertise OS version
  2025-01-09 15:46           ` Junio C Hamano
@ 2025-01-10 17:56             ` Usman Akinyemi
  2025-01-10 19:24               ` Junio C Hamano
  0 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-10 17:56 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, Christian Couder

On Thu, Jan 9, 2025 at 9:16 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> > Instead of having .format that will allow user to have multiple
> > variation or different placeholder,
> > we can allow it to take only specific values for examples:
> > - "full" which would mean the same thing as  the result of `uname -srvm`,
> > - "default" or "short" which would mean the same thing as  the result
> > of `uname -s`,
> > - "medium"  which would mean the same thing as  the result of `uname -sr`.
> >
> > What is your thought about this ?
>
> I think two-level is good enough.  One level is "yes, please give
> the minimum that would not offend even the privacy-conscious folks
> (like 'Linux', 'macOS', 'Windows' etc.)" or "no, please do not show
> os-version at all".  The other is "Please use this exact string."
> We do not need anything more elaborate.
>
> The reasoning behind this conclusion goes like this.
>
> First of all, I mentioned "registry of canonical os-version strings"
> to help the users of the "Please use this string" so their server do
> not have to suffer from different names and spellings to identify
> the same class of clients.
>
> But the server operators that *want* such tighter control *and* are
> capable of enforcing their choice to their users are probably $CORP
> in-house operators.  They can tell their employees what string to
> use, or they may even do that in /etc/gitconfig on the machines they
> give to their users.  In other words, they do not need our help at
> all.
>
> At least that is my thought.  Others may have different opinions.
Hi Junio,

Thanks for this.

So instead of having a .format config, we should have a .string config
which just takes a string and uses it as the value for the `os-version`
capability?

Thank you.
Usman.
>
> Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH 3/4] connect: advertise OS version
  2025-01-10 17:56             ` Usman Akinyemi
@ 2025-01-10 19:24               ` Junio C Hamano
  2025-01-11 11:07                 ` Usman Akinyemi
  0 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-01-10 19:24 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

>> First of all, I mentioned "registry of canonical os-version strings"
>> to help the users of the "Please use this string" so their server do
>> not have to suffer from different names and spellings to identify
>> the same class of clients.
>>
>> But the server operators that *want* such tighter control *and* are
>> capable of enforcing their choice to their users are probably $CORP
>> in-house operators.  They can tell their employees what string to
>> use, or they may even do that in /etc/gitconfig on the machines they
>> give to their users.  In other words, they do not need our help at
>> all.
>>
>> At least that is my thought.  Others may have different opinions.
> Hi Junio,
>
> Thanks for this.
>
> So instead of having a .format config, we should have a .string config
> which just
> takes a string and uses it as the value for the `os-version` capability ?

Ah, sorry, I totally misread your patch.  I somehow thought you _already_
have the "any string goes" variant implemented in the patch being reviewed.

If there isn't any such thing, then my preference is to add neither of
the configuration knobs and let the system-provided function give a
not-too-specific os-version string (like "Linux").  Once people gain
experience with that feature, then we will learn more about what
degree of customizability is required.

Sorry for the confusion.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH 3/4] connect: advertise OS version
  2025-01-10 19:24               ` Junio C Hamano
@ 2025-01-11 11:07                 ` Usman Akinyemi
  2025-01-13 15:46                   ` Junio C Hamano
  0 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-11 11:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, Christian Couder

On Sat, Jan 11, 2025 at 12:54 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> >> First of all, I mentioned "registry of canonical os-version strings"
> >> to help the users of the "Please use this string" so their server do
> >> not have to suffer from different names and spellings to identify
> >> the same class of clients.
> >>
> >> But the server operators that *want* such tighter control *and* are
> >> capable of enforcing their choice to their users are probably $CORP
> >> in-house operators.  They can tell their employees what string to
> >> use, or they may even do that in /etc/gitconfig on the machines they
> >> give to their users.  In other words, they do not need our help at
> >> all.
> >>
> >> At least that is my thought.  Others may have different opinions.
> > Hi Junio,
> >
> > Thanks for this.
> >
> > So instead of having a .format config, we should have a .string config
> > which just
> > takes a string and uses it as the value for the `os-version` capability ?
>
> Ah, sorry, I totally misread your patch.  I somehow thought you _already_
> have the "any string goes" variant implemented in the patch being reviewed.
>
> If there isn't any such thing, then my preference is add neither of
> the configuration knobs and let the system provided function give a
> not-too-specific os-version string (like "Linux").  Once people gain
> experiences with that feature, then we will learn more about what
> degree of customizability is required.
>
> Sorry for the confusion.
Hi Junio,

Thanks for this.

Actually, in this patch series, there is a config option called
`osVersion.command`. The specified command will be run and its output
will be used as the value for the `os-version` capability. This option
was specifically requested by Randall S. Becker in a previous
conversation:
https://lore.kernel.org/git/000a01dac25c$df7b23e0$9e716ba0$@nexbridge.com/

Thank you.
Usman

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH 3/4] connect: advertise OS version
  2025-01-11 11:07                 ` Usman Akinyemi
@ 2025-01-13 15:46                   ` Junio C Hamano
  2025-01-13 18:26                     ` Usman Akinyemi
  0 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-01-13 15:46 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> Actually, in this patch series, there is a config option called
> `osVersion.command`
> The specified command will be run and the output will be used as the
> value for `os-version`
> capability.

That is essentially a "you can throw at us any arbitrary string".
So my recommendation would not change.  .format would not give us
much _additional_ value in such a case.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH 3/4] connect: advertise OS version
  2025-01-13 15:46                   ` Junio C Hamano
@ 2025-01-13 18:26                     ` Usman Akinyemi
  2025-01-13 19:47                       ` Junio C Hamano
  0 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-13 18:26 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, Christian Couder

On Mon, Jan 13, 2025 at 9:16 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> > Actually, in this patch series, there is a config option called
> > `osVersion.command`
> > The specified command will be run and the output will be used as the
> > value for `os-version`
> > capability.
>
> That is essentially a "you can throw at us any arbitrary string".
> So my recommendation would not change.  .format would not give us
> much _additional_ value in such a case.
Hi Junio,

Thanks for this.  So, from what I understand, the feature and config
option introduced by this patch series are enough, and there is no need
to introduce another .format config. Right?

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH 3/4] connect: advertise OS version
  2025-01-13 18:26                     ` Usman Akinyemi
@ 2025-01-13 19:47                       ` Junio C Hamano
  2025-01-13 20:07                         ` rsbecker
  0 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-01-13 19:47 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> On Mon, Jan 13, 2025 at 9:16 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>>
>> > Actually, in this patch series, there is a config option called
>> > `osVersion.command`
>> > The specified command will be run and the output will be used as the
>> > value for `os-version`
>> > capability.
>>
>> That is essentially a "you can throw at us any arbitrary string".
>> So my recommendation would not change.  .format would not give us
>> much _additional_ value in such a case.
> Hi Junio,
>
> Thanks for this.  So, from what I understand, the feature and config
> option introduced by
> this patch series is enough, no need to introduce another .format
> config. Right ?

Yup.

At least until we and our userbase gain more experience with the
feature.

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* RE: [PATCH 3/4] connect: advertise OS version
  2025-01-13 19:47                       ` Junio C Hamano
@ 2025-01-13 20:07                         ` rsbecker
  0 siblings, 0 replies; 108+ messages in thread
From: rsbecker @ 2025-01-13 20:07 UTC (permalink / raw)
  To: 'Junio C Hamano', 'Usman Akinyemi'
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, 'Christian Couder'

On January 13, 2025 2:47 PM, Junio C Hamano wrote:
>Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
>> On Mon, Jan 13, 2025 at 9:16 PM Junio C Hamano <gitster@pobox.com> wrote:
>>>
>>> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>>>
>>> > Actually, in this patch series, there is a config option called
>>> > `osVersion.command` The specified command will be run and the
>>> > output will be used as the value for `os-version` capability.
>>>
>>> That is essentially a "you can throw at us any arbitrary string".
>>> So my recommendation would not change.  .format would not give us
>>> much _additional_ value in such a case.
>> Hi Junio,
>>
>> Thanks for this.  So, from what I understand, the feature and config
>> option introduced by this patch series is enough, no need to introduce
>> another .format config. Right ?
>
>Yup.
>
>At least until we and our userbase gain more experience with the feature.

My thought on this is somehow relating uname -? (something) with the format
code. Options to uname are not always standard and there are extensions, so
there might be some use in having a binding for non-standard stuff. For example,
on NonStop, uname -r returns the major OS level, and uname -v returns the
minor level. My thought on this is .format R=uname -r, .format V=uname -v, or
something like that. But as you said, it will take time for us to get experience
with this.

--Randall


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v2 0/6][Outreachy] Introduce os-version Capability with Configurable Options
  2025-01-06 10:30 [PATCH 0/4][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
                   ` (3 preceding siblings ...)
  2025-01-06 10:30 ` [PATCH 4/4] version: introduce osversion.command config for os-version output Usman Akinyemi
@ 2025-01-17 10:46 ` Usman Akinyemi
  2025-01-17 10:46   ` [PATCH v2 1/6] version: refactor redact_non_printables() Usman Akinyemi
                     ` (6 more replies)
  4 siblings, 7 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-17 10:46 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	sunshine, rsbecker

For debugging, statistical analysis, and security purposes, it can
be valuable for Git servers to know the operating system the clients
are using.

For example:
- A server noticing that a client is using an old Git version with
security issues on one platform, like macOS, could verify if the
user is indeed running macOS before sending a message to upgrade.
- Similarly, a server identifying a client that could benefit from
an upgrade (e.g., for performance reasons) could better customize the
message it sends to nudge the client to upgrade.

So let's add a new 'os-version' capability to the v2 protocol, in the
same way as the existing 'agent' capability that lets clients and servers
exchange the Git version they are running.
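
The protocol documentation added in patch 5/6 says the client MUST NOT
send `os-version` unless the server advertised it. As a rough
illustration of that rule only (the helper names below are hypothetical
and are not taken from the patches), a client-side check could look
like this:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if 'name' appears in the advertised
 * capability lines, either bare ("name") or with a value ("name=value").
 */
static int capability_advertised(const char *const *lines, size_t nr,
				 const char *name)
{
	size_t len = strlen(name);

	for (size_t i = 0; i < nr; i++) {
		if (!strncmp(lines[i], name, len) &&
		    (lines[i][len] == '\0' || lines[i][len] == '='))
			return 1;
	}
	return 0;
}

/*
 * Hypothetical helper: write "os-version=<value>" into 'out' only when
 * the server advertised the capability. Return 1 if something was
 * written, 0 if the client must stay silent.
 */
static int format_client_os_version(const char *const *lines, size_t nr,
				    const char *value,
				    char *out, size_t outlen)
{
	if (!capability_advertised(lines, nr, "os-version"))
		return 0;
	snprintf(out, outlen, "os-version=%s", value);
	return 1;
}
```

The actual series hooks into Git's existing capability handling in
connect.c and serve.c; the sketch only shows the "advertised first"
condition in isolation.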

By default this sends information similar to what `git bugreport`
already collects using uname(2). The difference is that it is sanitized
in the same way as the Git version sent by the 'agent' capability (by
replacing characters having an ASCII code of 32 or less, or of 127 or
more, with '.'). Also, it only sends the result of `uname -s`, i.e.
just the operating system name (e.g. "Linux").
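
To make the sanitization concrete, here is a minimal standalone sketch.
It assumes the boundaries given in the protocol documentation of patch
5/6 (only bytes 33 to 126 inclusive are kept) and works on a plain C
string rather than a strbuf, so it is not the code from the patch; the
name redact_sketch() is made up for this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Illustrative only: replace every byte outside the printable,
 * non-space ASCII range (33..126) with '.', in place.
 */
static void redact_sketch(char *s)
{
	for (size_t i = 0; s[i]; i++) {
		unsigned char ch = (unsigned char)s[i];

		if (ch <= 32 || ch >= 127)
			s[i] = '.';
	}
}
```

Note that spaces are redacted too, so the output of something like
`uname -sr` ends up with dots between its fields.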

To address privacy concerns, let's add the `transfer.advertiseOSVersion`
config option. This boolean option is enabled by default, but allows users to
disable this feature completely by setting it to "false".

To provide flexibility and customization, let's also add the
`osversion.command` config option. This allows users to specify a custom
command whose output will be used as the string exchanged via the
"os-version" capability. If this option is not set, the default behavior
exchanges only the operating system name, such as "Linux" or "Windows".
This option was specifically suggested by Randall S. Becker in a
previous conversation. You can find the reference here:
https://lore.kernel.org/git/000a01dac25c$df7b23e0$9e716ba0$@nexbridge.com/
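
How the two options interact can be sketched as below. This is only an
illustration under assumptions: the struct and function names are
hypothetical, the command is run with popen(3) instead of Git's
run-command API, only the first line of output is kept, and the result
is sanitized as described earlier.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical settings mirroring the two config options. */
struct os_version_settings {
	int advertise;       /* transfer.advertiseOSVersion (default 1) */
	const char *command; /* osVersion.command, NULL when unset */
};

/*
 * Fill 'out' with the string to exchange. Return 1 when a string was
 * produced, 0 when nothing should be advertised, -1 on error.
 */
static int os_version_sketch(const struct os_version_settings *cfg,
			     char *out, size_t outlen)
{
	size_t len;

	if (!outlen)
		return -1;
	out[0] = '\0';
	if (!cfg->advertise)
		return 0; /* feature disabled: send nothing at all */

	if (cfg->command) {
		/* Custom command: keep the first line of its output. */
		FILE *fp = popen(cfg->command, "r");

		if (!fp)
			return -1;
		if (!fgets(out, (int)outlen, fp))
			out[0] = '\0';
		if (pclose(fp) != 0)
			return -1;
	} else {
		/* Default: only the OS name, like `uname -s`. */
		struct utsname u;

		if (uname(&u) < 0)
			return -1;
		snprintf(out, outlen, "%s", u.sysname);
	}

	/* Drop trailing newline or carriage return. */
	len = strlen(out);
	while (len && (out[len - 1] == '\n' || out[len - 1] == '\r'))
		out[--len] = '\0';

	/* Redact everything outside bytes 33..126. */
	for (size_t i = 0; i < len; i++) {
		unsigned char ch = (unsigned char)out[i];

		if (ch <= 32 || ch >= 127)
			out[i] = '.';
	}
	return len ? 1 : -1;
}
```

With `advertise` set to 0 the command is never run, which matches the
intent that `transfer.advertiseOSVersion=false` disables the feature
completely.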

Note that, due to differences between `uname(1)` (command-line
utility) and `uname(2)` (system call) outputs on Windows,
`transfer.advertiseOSVersion` is set to false on Windows during
testing. See the message part of patch 5/6 for more details.

My mentor, Christian Couder, previously sent a patch series about this.
You can find it here:
https://lore.kernel.org/git/20240619125708.3719150-1-christian.couder@gmail.com/

Changes since v1
================
  - Refactored documentation for improved clarity.
  - Split patch "refactor get_uname_info()" into two patches, with the
    first part doing the refactoring and the second part doing the
    enhancement, for code clarity and cleanliness.
  - Made test_redact_non_printables() trim carriage returns.
  - Fixed outdated commit message.
  - Split part of the "test capability advertisement" test into a
    "setup"-type test to remove the side-effect dependency.
  - Renamed some files created during testing to make their content
    clearer.
  - Added comments to os_version(), os_version_sanitized() and
    advertise_os_version() to clarify what they do.

Usman Akinyemi (6):
  version: refactor redact_non_printables()
  version: refactor get_uname_info()
  version: extend get_uname_info() to hide system details
  t5701: add setup test to remove side-effect dependency
  connect: advertise OS version
  version: introduce osversion.command config for os-version output

 Documentation/config/transfer.txt |  16 ++++
 Documentation/gitprotocol-v2.txt  |  17 ++++
 builtin/bugreport.c               |  13 +--
 connect.c                         |   3 +
 serve.c                           |  14 ++++
 t/t5555-http-smart-common.sh      |  38 ++++++++-
 t/t5701-git-serve.sh              |  59 ++++++++++++-
 t/test-lib-functions.sh           |   8 ++
 version.c                         | 135 ++++++++++++++++++++++++++++--
 version.h                         |  28 +++++++
 10 files changed, 309 insertions(+), 22 deletions(-)

Range-diff versus v1:

1:  d23091031c ! 1:  97bccab6d5 version: refactor redact_non_printables()
    @@ Commit message
         For now the new redact_non_printables() function is still static as
         it's only needed locally.
     
    -    While at it, let's also make a few small improvements:
    -      - use 'size_t' for 'i' instead of 'int',
    -      - move the declaration of 'i' inside the 'for ( ... )',
    -      - use strbuf_detach() to explicitly detach the string contained by
    -        the 'buf' strbuf.
    +    While at it, let's use strbuf_detach() to explicitly detach the string
    +    contained by the 'buf' strbuf.
     
         Mentored-by: Christian Couder <chriscool@tuxfamily.org>
         Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
2:  1336622be9 ! 2:  1f8a4024a4 version: refactor get_uname_info()
    @@ Commit message
         Let's refactor this code into a new get_uname_info() function, so
         that we can reuse it in a following commit.
     
    -    We may need to refactor this function in the future if an
    -    `osVersion.format` config option is added, but for now we only
    -    need it to accept a "full" flag that makes it switch between providing
    -    full OS information and providing only the OS name. The mode
    -    providing only the OS name is needed in a following commit
    -
         Mentored-by: Christian Couder <chriscool@tuxfamily.org>
         Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
     
    @@ builtin/bugreport.c: static void get_system_info(struct strbuf *sys_info)
     -			    uname_info.release,
     -			    uname_info.version,
     -			    uname_info.machine);
    -+	get_uname_info(sys_info, 1);
    ++	get_uname_info(sys_info);
      
      	strbuf_addstr(sys_info, _("compiler info: "));
      	get_compiler_info(sys_info);
    @@ version.c: const char *git_user_agent_sanitized(void)
      	return agent;
      }
     +
    -+int get_uname_info(struct strbuf *buf, unsigned int full)
    ++int get_uname_info(struct strbuf *buf)
     +{
     +	struct utsname uname_info;
     +
    @@ version.c: const char *git_user_agent_sanitized(void)
     +		return -1;
     +	}
     +
    -+	if (full)
    -+		strbuf_addf(buf, "%s %s %s %s\n",
    -+			    uname_info.sysname,
    -+			    uname_info.release,
    -+			    uname_info.version,
    -+			    uname_info.machine);
    -+	else
    -+		strbuf_addf(buf, "%s\n", uname_info.sysname);
    ++	strbuf_addf(buf, "%s %s %s %s\n",
    ++		    uname_info.sysname,
    ++		    uname_info.release,
    ++		    uname_info.version,
    ++		    uname_info.machine);
     +	return 0;
     +}
     
    @@ version.h: extern const char git_built_from_commit_string[];
     +  Return -1 and put an error message into 'buf' in case of uname()
     +  error. Return 0 and put uname info into 'buf' otherwise.
     +*/
    -+int get_uname_info(struct strbuf *buf, unsigned int full);
    ++int get_uname_info(struct strbuf *buf);
     +
      #endif /* VERSION_H */
-:  ---------- > 3:  962b42702f version: extend get_uname_info() to hide system details
-:  ---------- > 4:  7f0ec75a0d t5701: add setup test to remove side-effect dependency
3:  b90a24813f ! 5:  499eda49cf connect: advertise OS version
    @@ Commit message
         controlled by the new `transfer.advertiseOSVersion` config option.
     
         Add the `transfer.advertiseOSVersion` config option to address
    -    privacy concerns issue. It defaults to `true` and can be changed to
    +    privacy concerns. It defaults to `true` and can be changed to
         `false`. When enabled, this option makes clients and servers send each
         other the OS name (e.g., "Linux" or "Windows"). The information is
         retrieved using the 'sysname' field of the `uname(2)` system call.
    @@ Commit message
           .2024-02-14.20:17.UTC.x86_64
           - `uname(2)` output: Windows.10.0.20348
     
    -    Until a good way to test the feature on Windows is found, the
    -    transfer.advertiseOSVersion is set to false on Windows during testing.
    +    On Windows, uname(2) is not actually system-supplied but is instead
    +    already faked up by Git itself. We could have overcome the test issue
    +    on Windows by implementing a new `uname` subcommand in `test-tool`
    +    using uname(2), but except uname(2), which would be tested against
    +    itself, there would be nothing platform specific, so it's just simpler
    +    to disable the tests on Windows.
     
         Mentored-by: Christian Couder <chriscool@tuxfamily.org>
         Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
    @@ Documentation/config/transfer.txt: transfer.bundleURI::
     +	When `true`, the `os-version` capability is advertised by clients and
     +	servers. It makes clients and servers send to each other a string
     +	representing the operating system name, like "Linux" or "Windows".
    -+	This string is retrieved from the 'sysname' field of the struct returned
    ++	This string is retrieved from the `sysname` field of the struct returned
     +	by the uname(2) system call. Defaults to true.
     
      ## Documentation/gitprotocol-v2.txt ##
    @@ Documentation/gitprotocol-v2.txt: printable ASCII characters except space (i.e.,
     +~~~~~~~~~~
     +
     +In the same way as the `agent` capability above, the server can
    -+advertise the `os-version` capability with a value `X` (in the form
    -+`os-version=X`) to notify the client that the server is running an
    -+operating system that can be identified by `X`. The client may
    -+optionally send its own `os-version` string by including the
    -+`os-version` capability with a value `Y` (in the form `os-version=Y`)
    -+in its request to the server (but it MUST NOT do so if the server did
    -+not advertise the os-version capability). The `X` and `Y` strings may
    -+contain any printable ASCII characters except space (i.e., the byte
    -+range 32 < x < 127), and are typically made from the result of
    ++advertise the `os-version` capability to notify the client the
    ++kind of operating system it is running on. The client may optionally
    ++send its own `os-version` capability, to notify the server the kind of
    ++operating system it is also running on in its request to the server
    ++(but it MUST NOT do so if the server did not advertise the os-version
    ++capability). The value of this capability may consist of ASCII printable
    ++characters(from 33 to 126 inclusive) and are typically made from the result of
     +`uname -s`(OS name e.g Linux). The os-version capability can be disabled
     +entirely by setting the `transfer.advertiseOSVersion` config option
     +to `false`. The `os-version` strings are purely informative for
    @@ t/t5555-http-smart-common.sh: test_expect_success 'git receive-pack --advertise-
      '
      
      test_expect_success 'git upload-pack --advertise-refs: v2' '
    -+	printf "agent=FAKE" >agent_and_os_name &&
    ++	printf "agent=FAKE" >agent_and_osversion &&
     +	if test_have_prereq WINDOWS
     +	then
    -+		# We do not use test_config here so that any tests below can reuse
    -+		# the "expect" file from this test
     +		git config transfer.advertiseOSVersion false
     +	else
    -+		printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_os_name
    ++		printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_osversion
     +	fi &&
     +
      	cat >expect <<-EOF &&
      	version 2
     -	agent=FAKE
    -+	$(cat agent_and_os_name)
    ++	$(cat agent_and_osversion)
      	ls-refs=unborn
      	fetch=shallow wait-for-done
      	server-option
     
      ## t/t5701-git-serve.sh ##
    -@@ t/t5701-git-serve.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
    - . ./test-lib.sh
    - 
    - test_expect_success 'test capability advertisement' '
    -+	printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_os_name &&
    +@@ t/t5701-git-serve.sh: test_expect_success 'setup to generate files with expected content' '
    + 	cat >expect.trailer <<-EOF &&
    + 	0000
    + 	EOF
    ++
     +	if test_have_prereq WINDOWS
     +	then
    -+		# We do not use test_config here so that tests below will be able to reuse
    -+		# the expect.base and expect.trailer files
     +		git config transfer.advertiseOSVersion false
     +	else
    -+		printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_os_name
    ++		printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_osversion
     +	fi &&
     +
    - 	test_oid_cache <<-EOF &&
    - 	wrong_algo sha1:sha256
    - 	wrong_algo sha256:sha1
    ++	cat >expect_osversion.base <<-EOF
    ++	version 2
    ++	$(cat agent_and_osversion)
    ++	ls-refs=unborn
    ++	fetch=shallow wait-for-done
    ++	server-option
    ++	object-format=$(test_oid algo)
    ++	EOF
    + '
    + 
    + test_expect_success 'test capability advertisement' '
    +-	cat expect.base expect.trailer >expect &&
    ++	cat expect_osversion.base expect.trailer >expect &&
    + 
    + 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
    + 		--advertise-capabilities >out &&
    +@@ t/t5701-git-serve.sh: test_expect_success 'test capability advertisement with uploadpack.advertiseBund
    + 	cat >expect.extra <<-EOF &&
    + 	bundle-uri
      	EOF
    - 	cat >expect.base <<-EOF &&
    - 	version 2
    --	agent=git/$(git version | cut -d" " -f3)
    -+	$(cat agent_and_os_name)
    - 	ls-refs=unborn
    - 	fetch=shallow wait-for-done
    - 	server-option
    +-	cat expect.base \
    ++	cat expect_osversion.base \
    + 	    expect.extra \
    + 	    expect.trailer >expect &&
    + 
     
      ## t/test-lib-functions.sh ##
     @@ t/test-lib-functions.sh: test_trailing_hash () {
    @@ t/test-lib-functions.sh: test_trailing_hash () {
     +# Octal intervals \001-\040 and \177-\377
     +# corresponds to decimal intervals 1-32 and 127-255
     +test_redact_non_printables () {
    -+    tr -d "\n" | tr "[\001-\040][\177-\377]" "."
    ++    tr -d "\n\r" | tr "[\001-\040][\177-\377]" "."
     +}
     
      ## version.c ##
    @@ version.c
      const char git_version_string[] = GIT_VERSION;
      const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
     @@ version.c: int get_uname_info(struct strbuf *buf, unsigned int full)
    - 		strbuf_addf(buf, "%s\n", uname_info.sysname);
    + 	     strbuf_addf(buf, "%s\n", uname_info.sysname);
      	return 0;
      }
     +
    @@ version.h: const char *git_user_agent_sanitized(void);
      */
      int get_uname_info(struct strbuf *buf, unsigned int full);
      
    ++/*
    ++  Retrieve and cache system information for subsequent calls.
    ++  Return a pointer to the cached system information string.
    ++*/
     +const char *os_version(void);
    ++
    ++/*
    ++  Retrieve system information string from os_version(). Then
    ++  sanitize and cache it. Return a pointer to the sanitized
    ++  system information string.
    ++*/
     +const char *os_version_sanitized(void);
    ++
    ++/*
    ++  Retrieve and cache whether os-version capability is enabled.
    ++  Return 1 if enabled, 0 if disabled.
    ++*/
     +int advertise_os_version(struct repository *r);
     +
      #endif /* VERSION_H */
4:  745e63060e ! 6:  a1637dc7cf version: introduce osversion.command config for os-version output
    @@ Commit message
         Let's introduce a new configuration option, `osversion.command`, to handle
         the string exchange between servers and clients. This option allows
         customization of the exchanged string by leveraging the output of the
    -    specified command. If this is not set, the `os-version` capability
    -    exchange just the operating system name.
    +    specified command. This customization might be especially useful on some
    +    quite uncommon platforms like NonStop where interesting OS information is
    +    available from other means than uname(2).
     
    +    If this new configuration option is not set, the `os-version` capability
    +    exchanges just the operating system name.
    +
    +    Helped-by: Randall S. Becker <rsbecker@nexbridge.com>
         Mentored-by: Christian Couder <chriscool@tuxfamily.org>
         Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
     
    @@ Documentation/config/transfer.txt
     @@ Documentation/config/transfer.txt: transfer.advertiseOSVersion::
      	servers. It makes clients and servers send to each other a string
      	representing the operating system name, like "Linux" or "Windows".
    - 	This string is retrieved from the 'sysname' field of the struct returned
    + 	This string is retrieved from the `sysname` field of the struct returned
     -	by the uname(2) system call. Defaults to true.
     +	by the uname(2) system call. If the `osVersion.command` is set, the
     +	output of the command specified will be the string exchanged by the clients
    @@ Documentation/config/transfer.txt: transfer.advertiseOSVersion::
     +	`transfer.advertiseOSVersion` config option.
     
      ## Documentation/gitprotocol-v2.txt ##
    -@@ Documentation/gitprotocol-v2.txt: in its request to the server (but it MUST NOT do so if the server did
    - not advertise the os-version capability). The `X` and `Y` strings may
    - contain any printable ASCII characters except space (i.e., the byte
    - range 32 < x < 127), and are typically made from the result of
    +@@ Documentation/gitprotocol-v2.txt: the presence or absence of particular features.
    + os-version
    + ~~~~~~~~~~
    + 
    +-In the same way as the `agent` capability above, the server can
    +-advertise the `os-version` capability to notify the client the
    +-kind of operating system it is running on. The client may optionally
    +-send its own `os-version` capability, to notify the server the kind of
    +-operating system it is also running on in its request to the server
    +-(but it MUST NOT do so if the server did not advertise the os-version
    +-capability). The value of this capability may consist of ASCII printable
    ++In the same way as the `agent` capability above, the server can advertise
    ++the `os-version` capability to notify the client the kind of operating system
    ++it is running on. The client may optionally send its own `os-version` capability,
    ++to notify the server the kind of operating system it is also running on in its
    ++request to the server (but it MUST NOT do so if the server did not advertise the
    ++os-version capability). The value of this capability may consist of ASCII printable
    + characters(from 33 to 126 inclusive) and are typically made from the result of
     -`uname -s`(OS name e.g Linux). The os-version capability can be disabled
     -entirely by setting the `transfer.advertiseOSVersion` config option
     -to `false`. The `os-version` strings are purely informative for
     -statistics and debugging purposes, and MUST NOT be used to
     -programmatically assume the presence or absence of particular
     -features.
    -+`uname -s`(OS name e.g Linux).  If the `osVersion.command` is set,
    -+the `X` and `Y` are made from the ouput of the command specified.
    -+The os-version capability can be disabled entirely by setting the
    -+`transfer.advertiseOSVersion` config option to `false`. The `os-version`
    -+strings are purely informative for statistics and debugging purposes, and
    -+MUST NOT be used to programmatically assume the presence or absence of
    -+particular features.
    ++`uname -s`(OS name e.g Linux). If the `osVersion.command` is set, the value of this
    ++capability are made from the ouput of the command specified. The os-version capability
    ++can be disabled entirely by setting the `transfer.advertiseOSVersion` config option
    ++to `false`. The `os-version` strings are purely informative for statistics and
    ++debugging purposes, and MUST NOT be used to programmatically assume the presence or
    ++absence of particular features.
      
      ls-refs
      ~~~~~~~
    @@ t/t5555-http-smart-common.sh: test_expect_success 'git upload-pack --advertise-r
      '
      
     +test_expect_success 'git upload-pack --advertise-refs: v2 with osVersion.command config set' '
    -+	# test_config is used here as we are not reusing any file output from here
     +	test_config osVersion.command "uname -srvm" &&
    -+	printf "agent=FAKE" >agent_and_long_os_name &&
    ++	printf "agent=FAKE" >agent_and_long_osversion &&
     +
     +	if test_have_prereq !WINDOWS
     +	then
    -+		printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_os_name
    ++		printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_osversion
     +	fi &&
     +
     +	cat >expect <<-EOF &&
     +	version 2
    -+	$(cat agent_and_long_os_name)
    ++	$(cat agent_and_long_osversion)
     +	ls-refs=unborn
     +	fetch=shallow wait-for-done
     +	server-option
    @@ t/t5701-git-serve.sh: test_expect_success 'test capability advertisement' '
      '
      
     +test_expect_success 'test capability advertisement with osVersion.command config set' '
    -+	# test_config is used here as we are not reusing any file output from here
     +	test_config osVersion.command "uname -srvm" &&
    -+	printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_long_os_name &&
    ++	printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_long_osversion &&
     +
     +	if test_have_prereq !WINDOWS
     +	then
    -+		printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_os_name
    ++		printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_osversion
     +	fi &&
     +
     +	test_oid_cache <<-EOF &&
     +	wrong_algo sha1:sha256
     +	wrong_algo sha256:sha1
     +	EOF
    -+	cat >expect.base_long <<-EOF &&
    ++	cat >expect_long.base <<-EOF &&
     +	version 2
    -+	$(cat agent_and_long_os_name)
    ++	$(cat agent_and_long_osversion)
     +	ls-refs=unborn
     +	fetch=shallow wait-for-done
     +	server-option
     +	object-format=$(test_oid algo)
     +	EOF
    -+	cat >expect.trailer_long <<-EOF &&
    -+	0000
    -+	EOF
    -+	cat expect.base_long expect.trailer_long >expect &&
    ++	cat expect_long.base expect.trailer >expect &&
     +
     +	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
     +		--advertise-capabilities >out &&

-- 
2.48.0


* [PATCH v2 1/6] version: refactor redact_non_printables()
  2025-01-17 10:46 ` [PATCH v2 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
@ 2025-01-17 10:46   ` Usman Akinyemi
  2025-01-17 18:26     ` Junio C Hamano
  2025-01-17 10:46   ` [PATCH v2 2/6] version: refactor get_uname_info() Usman Akinyemi
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-17 10:46 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	sunshine, rsbecker, Christian Couder

The git_user_agent_sanitized() function performs some sanitizing to
avoid special characters being sent over the line and possibly messing
up the protocol or the parsing on the other side.

Let's extract this sanitizing into a new redact_non_printables() function,
as we will want to reuse it in a following patch.

For now the new redact_non_printables() function is still static as
it's only needed locally.

While at it, let's use strbuf_detach() to explicitly detach the string
contained by the 'buf' strbuf.
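
As a rough illustration of the sanitizing rule described above, here is
a minimal shell sketch. It is not Git's implementation: the name
redact_sketch is hypothetical, and it assumes single-line input. It
trims surrounding whitespace and then replaces every byte with code 32
or below, or 127 or above, with a dot.

```shell
#!/bin/sh
# Sketch of the sanitizing rule, not Git's implementation.
# redact_sketch is a hypothetical name used only for this example.
redact_sketch () {
	# Trim leading and trailing whitespace, roughly as strbuf_trim() does.
	trimmed=$(printf '%s\n' "$1" |
		LC_ALL=C sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
	# Replace bytes 1-32 (octal \001-\040) and 127-255 (\177-\377) with '.'.
	printf '%s' "$trimmed" | LC_ALL=C tr '\001-\040\177-\377' '.'
}

redact_sketch '  git/2.48.0 (custom build)  '
echo
```

With the input above, the inner spaces become dots while the
surrounding ones are trimmed away, so no space can reach the wire.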

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 version.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/version.c b/version.c
index 4d763ab48d..78f025c808 100644
--- a/version.c
+++ b/version.c
@@ -6,6 +6,20 @@
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
 
+/*
+ * Trim 'buf' and replace each character whose ASCII code is 32 or
+ * below, or 127 or above, with a dot '.' character.
+ * TODO: ensure consecutive non-printable characters are only replaced once
+ */
+static void redact_non_printables(struct strbuf *buf)
+{
+	strbuf_trim(buf);
+	for (size_t i = 0; i < buf->len; i++) {
+		if (buf->buf[i] <= 32 || buf->buf[i] >= 127)
+			buf->buf[i] = '.';
+	}
+}
+
 const char *git_user_agent(void)
 {
 	static const char *agent = NULL;
@@ -27,12 +41,8 @@ const char *git_user_agent_sanitized(void)
 		struct strbuf buf = STRBUF_INIT;
 
 		strbuf_addstr(&buf, git_user_agent());
-		strbuf_trim(&buf);
-		for (size_t i = 0; i < buf.len; i++) {
-			if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
-				buf.buf[i] = '.';
-		}
-		agent = buf.buf;
+		redact_non_printables(&buf);
+		agent = strbuf_detach(&buf, NULL);
 	}
 
 	return agent;
-- 
2.48.0


* [PATCH v2 2/6] version: refactor get_uname_info()
  2025-01-17 10:46 ` [PATCH v2 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
  2025-01-17 10:46   ` [PATCH v2 1/6] version: refactor redact_non_printables() Usman Akinyemi
@ 2025-01-17 10:46   ` Usman Akinyemi
  2025-01-17 10:46   ` [PATCH v2 3/6] version: extend get_uname_info() to hide system details Usman Akinyemi
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-17 10:46 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	sunshine, rsbecker, Christian Couder

Some code from "builtin/bugreport.c" uses uname(2) to get system
information.

Let's refactor this code into a new get_uname_info() function, so
that we can reuse it in a following commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 builtin/bugreport.c | 13 ++-----------
 version.c           | 20 ++++++++++++++++++++
 version.h           |  7 +++++++
 3 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 7c2df035c9..5e13d532a8 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -12,10 +12,10 @@
 #include "diagnose.h"
 #include "object-file.h"
 #include "setup.h"
+#include "version.h"
 
 static void get_system_info(struct strbuf *sys_info)
 {
-	struct utsname uname_info;
 	char *shell = NULL;
 
 	/* get git version from native cmd */
@@ -24,16 +24,7 @@ static void get_system_info(struct strbuf *sys_info)
 
 	/* system call for other version info */
 	strbuf_addstr(sys_info, "uname: ");
-	if (uname(&uname_info))
-		strbuf_addf(sys_info, _("uname() failed with error '%s' (%d)\n"),
-			    strerror(errno),
-			    errno);
-	else
-		strbuf_addf(sys_info, "%s %s %s %s\n",
-			    uname_info.sysname,
-			    uname_info.release,
-			    uname_info.version,
-			    uname_info.machine);
+	get_uname_info(sys_info);
 
 	strbuf_addstr(sys_info, _("compiler info: "));
 	get_compiler_info(sys_info);
diff --git a/version.c b/version.c
index 78f025c808..96f474c8e6 100644
--- a/version.c
+++ b/version.c
@@ -2,6 +2,7 @@
 #include "version.h"
 #include "version-def.h"
 #include "strbuf.h"
+#include "gettext.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -47,3 +48,22 @@ const char *git_user_agent_sanitized(void)
 
 	return agent;
 }
+
+int get_uname_info(struct strbuf *buf)
+{
+	struct utsname uname_info;
+
+	if (uname(&uname_info)) {
+		strbuf_addf(buf, _("uname() failed with error '%s' (%d)\n"),
+			    strerror(errno),
+			    errno);
+		return -1;
+	}
+
+	strbuf_addf(buf, "%s %s %s %s\n",
+		    uname_info.sysname,
+		    uname_info.release,
+		    uname_info.version,
+		    uname_info.machine);
+	return 0;
+}
diff --git a/version.h b/version.h
index 7c62e80577..afe3dbbab7 100644
--- a/version.h
+++ b/version.h
@@ -7,4 +7,11 @@ extern const char git_built_from_commit_string[];
 const char *git_user_agent(void);
 const char *git_user_agent_sanitized(void);
 
+/*
+  Try to get information about the system using uname(2).
+  Return -1 and put an error message into 'buf' in case of uname()
+  error. Return 0 and put uname info into 'buf' otherwise.
+*/
+int get_uname_info(struct strbuf *buf);
+
 #endif /* VERSION_H */
-- 
2.48.0


* [PATCH v2 3/6] version: extend get_uname_info() to hide system details
  2025-01-17 10:46 ` [PATCH v2 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
  2025-01-17 10:46   ` [PATCH v2 1/6] version: refactor redact_non_printables() Usman Akinyemi
  2025-01-17 10:46   ` [PATCH v2 2/6] version: refactor get_uname_info() Usman Akinyemi
@ 2025-01-17 10:46   ` Usman Akinyemi
  2025-01-17 18:27     ` Junio C Hamano
  2025-01-17 10:46   ` [PATCH v2 4/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-17 10:46 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	sunshine, rsbecker, Christian Couder

Currently, the get_uname_info() function provides the full OS information.
In a following commit, we will need it to provide only the OS name.

Let's extend it to accept a "full" flag that makes it switch between
providing full OS information and providing only the OS name.

We may need to refactor this function in the future if an
`osVersion.format` is added.
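
The effect of the "full" flag can be approximated from the shell. This
is only a sketch: uname_info_sketch is a hypothetical name, and it
calls uname(1) rather than uname(2), which can give different results
on some platforms (Windows in particular).

```shell
#!/bin/sh
# uname_info_sketch is a hypothetical stand-in for get_uname_info();
# it uses uname(1), which can differ from uname(2) on some platforms.
uname_info_sketch () {
	if test "$1" = 1
	then
		# sysname, release, version, machine
		uname -srvm
	else
		# sysname only
		uname -s
	fi
}

uname_info_sketch 1	# full information, as used by `git bugreport`
uname_info_sketch 0	# OS name only
```

The second form is the one that exposes the least about the system,
which is why a later patch in the series needs it.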

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 builtin/bugreport.c |  2 +-
 version.c           | 16 +++++++++-------
 version.h           |  2 +-
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 5e13d532a8..e3288a86c8 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -24,7 +24,7 @@ static void get_system_info(struct strbuf *sys_info)
 
 	/* system call for other version info */
 	strbuf_addstr(sys_info, "uname: ");
-	get_uname_info(sys_info);
+	get_uname_info(sys_info, 1);
 
 	strbuf_addstr(sys_info, _("compiler info: "));
 	get_compiler_info(sys_info);
diff --git a/version.c b/version.c
index 96f474c8e6..46835ec83f 100644
--- a/version.c
+++ b/version.c
@@ -49,7 +49,7 @@ const char *git_user_agent_sanitized(void)
 	return agent;
 }
 
-int get_uname_info(struct strbuf *buf)
+int get_uname_info(struct strbuf *buf, unsigned int full)
 {
 	struct utsname uname_info;
 
@@ -59,11 +59,13 @@ int get_uname_info(struct strbuf *buf)
 			    errno);
 		return -1;
 	}
-
-	strbuf_addf(buf, "%s %s %s %s\n",
-		    uname_info.sysname,
-		    uname_info.release,
-		    uname_info.version,
-		    uname_info.machine);
+	if (full)
+		strbuf_addf(buf, "%s %s %s %s\n",
+			    uname_info.sysname,
+			    uname_info.release,
+			    uname_info.version,
+			    uname_info.machine);
+	else
+	     strbuf_addf(buf, "%s\n", uname_info.sysname);
 	return 0;
 }
diff --git a/version.h b/version.h
index afe3dbbab7..5eb586c0bd 100644
--- a/version.h
+++ b/version.h
@@ -12,6 +12,6 @@ const char *git_user_agent_sanitized(void);
   Return -1 and put an error message into 'buf' in case of uname()
   error. Return 0 and put uname info into 'buf' otherwise.
 */
-int get_uname_info(struct strbuf *buf);
+int get_uname_info(struct strbuf *buf, unsigned int full);
 
 #endif /* VERSION_H */
-- 
2.48.0


* [PATCH v2 4/6] t5701: add setup test to remove side-effect dependency
  2025-01-17 10:46 ` [PATCH v2 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
                     ` (2 preceding siblings ...)
  2025-01-17 10:46   ` [PATCH v2 3/6] version: extend get_uname_info() to hide system details Usman Akinyemi
@ 2025-01-17 10:46   ` Usman Akinyemi
  2025-01-17 19:31     ` Junio C Hamano
  2025-01-17 10:46   ` [PATCH v2 5/6] connect: advertise OS version Usman Akinyemi
                     ` (2 subsequent siblings)
  6 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-17 10:46 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	sunshine, rsbecker, Christian Couder

Currently, the "test capability advertisement" test creates some files
with expected content which are used by other tests below it.

To remove that side-effect from this test, let's split up part of
it into a "setup"-type test which creates the files with expected content
which gets reused by multiple tests. This will be useful in a following
commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 t/t5701-git-serve.sh | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index de904c1655..0c0a5b2aec 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -7,14 +7,17 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
-test_expect_success 'test capability advertisement' '
+test_expect_success 'setup to generate files with expected content' '
+	printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_osversion &&
+
 	test_oid_cache <<-EOF &&
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
+
 	cat >expect.base <<-EOF &&
 	version 2
-	agent=git/$(git version | cut -d" " -f3)
+	$(cat agent_and_osversion)
 	ls-refs=unborn
 	fetch=shallow wait-for-done
 	server-option
@@ -23,6 +26,9 @@ test_expect_success 'test capability advertisement' '
 	cat >expect.trailer <<-EOF &&
 	0000
 	EOF
+'
+
+test_expect_success 'test capability advertisement' '
 	cat expect.base expect.trailer >expect &&
 
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
-- 
2.48.0


* [PATCH v2 5/6] connect: advertise OS version
  2025-01-17 10:46 ` [PATCH v2 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
                     ` (3 preceding siblings ...)
  2025-01-17 10:46   ` [PATCH v2 4/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
@ 2025-01-17 10:46   ` Usman Akinyemi
  2025-01-17 19:35     ` Junio C Hamano
  2025-01-17 22:22     ` Junio C Hamano
  2025-01-17 10:46   ` [PATCH v2 6/6] version: introduce osversion.command config for os-version output Usman Akinyemi
  2025-01-24 12:21   ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
  6 siblings, 2 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-17 10:46 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	sunshine, rsbecker, Christian Couder

As some issues that can happen with a Git client can be operating system
specific, it can be useful for a server to know which OS a client is
using. In the same way it can be useful for a client to know which OS
a server is using.

Let's introduce a new protocol capability (`os-version`) allowing Git
clients and servers to exchange operating system information. The
capability is controlled by the new `transfer.advertiseOSVersion`
config option.

Add the `transfer.advertiseOSVersion` config option to address
privacy concerns. It defaults to `true` and can be changed to
`false`. When enabled, this option makes clients and servers send each
other the OS name (e.g., "Linux" or "Windows"). The information is
retrieved using the 'sysname' field of the `uname(2)` system call.

However, there are differences between `uname(1)` (command-line utility)
and `uname(2)` (system call) outputs on Windows. These discrepancies
complicate testing on Windows platforms. For example:
  - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\
  .2024-02-14.20:17.UTC.x86_64
  - `uname(2)` output: Windows.10.0.20348

On Windows, uname(2) is not actually system-supplied but is instead
already faked up by Git itself. We could have overcome the test issue
on Windows by implementing a new `uname` subcommand in `test-tool`
using uname(2), but then uname(2) would only be tested against
itself, with nothing platform specific being checked, so it's simpler
to disable the tests on Windows.
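
To make the advertised value concrete, here is a small shell sketch of
how the capability line could be derived. The helper name
os_version_line_sketch is hypothetical, its argument stands in for the
value of `transfer.advertiseOSVersion`, and it only recognizes the
literal string "false", whereas Git accepts several boolean spellings.

```shell
#!/bin/sh
# os_version_line_sketch is a hypothetical helper for illustration.
# $1 stands in for the value of transfer.advertiseOSVersion; an empty
# or missing value means "unset", which this sketch treats as enabled.
os_version_line_sketch () {
	case "${1:-true}" in
	false)
		# Capability disabled: nothing is advertised.
		return 0
		;;
	esac
	# Drop the trailing newline, then redact bytes 1-32 and 127-255.
	name=$(uname -s | tr -d '\n\r' | LC_ALL=C tr '\001-\040\177-\377' '.')
	printf 'os-version=%s\n' "$name"
}

os_version_line_sketch		# e.g. os-version=Linux on a Linux host
os_version_line_sketch false	# prints nothing
```

The sketch relies on uname(1), so on Windows its output would not match
what the uname(2) emulation in Git reports.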

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 Documentation/config/transfer.txt |  7 ++++++
 Documentation/gitprotocol-v2.txt  | 18 +++++++++++++
 connect.c                         |  3 +++
 serve.c                           | 14 +++++++++++
 t/t5555-http-smart-common.sh      | 10 +++++++-
 t/t5701-git-serve.sh              | 20 +++++++++++++--
 t/test-lib-functions.sh           |  8 ++++++
 version.c                         | 42 +++++++++++++++++++++++++++++++
 version.h                         | 21 ++++++++++++++++
 9 files changed, 140 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
index f1ce50f4a6..c368a893bd 100644
--- a/Documentation/config/transfer.txt
+++ b/Documentation/config/transfer.txt
@@ -125,3 +125,10 @@ transfer.bundleURI::
 transfer.advertiseObjectInfo::
 	When `true`, the `object-info` capability is advertised by
 	servers. Defaults to false.
+
+transfer.advertiseOSVersion::
+	When `true`, the `os-version` capability is advertised by clients and
+	servers. It makes clients and servers send to each other a string
+	representing the operating system name, like "Linux" or "Windows".
+	This string is retrieved from the `sysname` field of the struct returned
+	by the uname(2) system call. Defaults to true.
diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
index 1652fef3ae..a332b55e4c 100644
--- a/Documentation/gitprotocol-v2.txt
+++ b/Documentation/gitprotocol-v2.txt
@@ -190,6 +190,24 @@ printable ASCII characters except space (i.e., the byte range 32 < x <
 and debugging purposes, and MUST NOT be used to programmatically assume
 the presence or absence of particular features.
 
+os-version
+~~~~~~~~~~
+
+In the same way as the `agent` capability above, the server can
+advertise the `os-version` capability to notify the client the
+kind of operating system it is running on. The client may optionally
+send its own `os-version` capability, to notify the server the kind of
+operating system it is also running on in its request to the server
+(but it MUST NOT do so if the server did not advertise the os-version
+capability). The value of this capability may consist of ASCII printable
+characters(from 33 to 126 inclusive) and are typically made from the result of
+`uname -s`(OS name e.g Linux). The os-version capability can be disabled
+entirely by setting the `transfer.advertiseOSVersion` config option
+to `false`. The `os-version` strings are purely informative for
+statistics and debugging purposes, and MUST NOT be used to
+programmatically assume the presence or absence of particular
+features.
+
 ls-refs
 ~~~~~~~
 
diff --git a/connect.c b/connect.c
index 10fad43e98..6d5792b63c 100644
--- a/connect.c
+++ b/connect.c
@@ -492,6 +492,9 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
 	if (server_supports_v2("agent"))
 		packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());
 
+	if (server_supports_v2("os-version") && advertise_os_version(the_repository))
+		packet_write_fmt(fd_out, "os-version=%s", os_version_sanitized());
+
 	if (server_feature_v2("object-format", &hash_name)) {
 		int hash_algo = hash_algo_by_name(hash_name);
 		if (hash_algo == GIT_HASH_UNKNOWN)
diff --git a/serve.c b/serve.c
index c8694e3751..5b0d54ae9a 100644
--- a/serve.c
+++ b/serve.c
@@ -31,6 +31,16 @@ static int agent_advertise(struct repository *r UNUSED,
 	return 1;
 }
 
+static int os_version_advertise(struct repository *r,
+			   struct strbuf *value)
+{
+	if (!advertise_os_version(r))
+		return 0;
+	if (value)
+		strbuf_addstr(value, os_version_sanitized());
+	return 1;
+}
+
 static int object_format_advertise(struct repository *r,
 				   struct strbuf *value)
 {
@@ -123,6 +133,10 @@ static struct protocol_capability capabilities[] = {
 		.name = "agent",
 		.advertise = agent_advertise,
 	},
+	{
+		.name = "os-version",
+		.advertise = os_version_advertise,
+	},
 	{
 		.name = "ls-refs",
 		.advertise = ls_refs_advertise,
diff --git a/t/t5555-http-smart-common.sh b/t/t5555-http-smart-common.sh
index e47ea1ad10..6f357a005a 100755
--- a/t/t5555-http-smart-common.sh
+++ b/t/t5555-http-smart-common.sh
@@ -123,9 +123,17 @@ test_expect_success 'git receive-pack --advertise-refs: v1' '
 '
 
 test_expect_success 'git upload-pack --advertise-refs: v2' '
+	printf "agent=FAKE" >agent_and_osversion &&
+	if test_have_prereq WINDOWS
+	then
+		git config transfer.advertiseOSVersion false
+	else
+		printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_osversion
+	fi &&
+
 	cat >expect <<-EOF &&
 	version 2
-	agent=FAKE
+	$(cat agent_and_osversion)
 	ls-refs=unborn
 	fetch=shallow wait-for-done
 	server-option
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 0c0a5b2aec..8a783b3924 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -26,10 +26,26 @@ test_expect_success 'setup to generate files with expected content' '
 	cat >expect.trailer <<-EOF &&
 	0000
 	EOF
+
+	if test_have_prereq WINDOWS
+	then
+		git config transfer.advertiseOSVersion false
+	else
+		printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_osversion
+	fi &&
+
+	cat >expect_osversion.base <<-EOF
+	version 2
+	$(cat agent_and_osversion)
+	ls-refs=unborn
+	fetch=shallow wait-for-done
+	server-option
+	object-format=$(test_oid algo)
+	EOF
 '
 
 test_expect_success 'test capability advertisement' '
-	cat expect.base expect.trailer >expect &&
+	cat expect_osversion.base expect.trailer >expect &&
 
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
 		--advertise-capabilities >out &&
@@ -357,7 +373,7 @@ test_expect_success 'test capability advertisement with uploadpack.advertiseBund
 	cat >expect.extra <<-EOF &&
 	bundle-uri
 	EOF
-	cat expect.base \
+	cat expect_osversion.base \
 	    expect.extra \
 	    expect.trailer >expect &&
 
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78e054ab50..f7ff38521c 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -2007,3 +2007,11 @@ test_trailing_hash () {
 		test-tool hexdump |
 		sed "s/ //g"
 }
+
+# Delete newlines and carriage returns, then replace each character
+# with ASCII code 32 or below, or 127 or above, with a dot '.' character.
+# Octal intervals \001-\040 and \177-\377
+# correspond to decimal intervals 1-32 and 127-255
+test_redact_non_printables () {
+    tr -d "\n\r" | tr "[\001-\040][\177-\377]" "."
+}
diff --git a/version.c b/version.c
index 46835ec83f..ea334c3e9c 100644
--- a/version.c
+++ b/version.c
@@ -3,6 +3,7 @@
 #include "version-def.h"
 #include "strbuf.h"
 #include "gettext.h"
+#include "config.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -69,3 +70,44 @@ int get_uname_info(struct strbuf *buf, unsigned int full)
 	     strbuf_addf(buf, "%s\n", uname_info.sysname);
 	return 0;
 }
+
+const char *os_version(void)
+{
+	static const char *os = NULL;
+
+	if (!os) {
+		struct strbuf buf = STRBUF_INIT;
+
+		get_uname_info(&buf, 0);
+		os = strbuf_detach(&buf, NULL);
+	}
+
+	return os;
+}
+
+const char *os_version_sanitized(void)
+{
+	static const char *os_sanitized = NULL;
+
+	if (!os_sanitized) {
+		struct strbuf buf = STRBUF_INIT;
+
+		strbuf_addstr(&buf, os_version());
+		redact_non_printables(&buf);
+		os_sanitized = strbuf_detach(&buf, NULL);
+	}
+
+	return os_sanitized;
+}
+
+int advertise_os_version(struct repository *r)
+{
+	static int transfer_advertise_os_version = -1;
+
+	if (transfer_advertise_os_version == -1) {
+		repo_config_get_bool(r, "transfer.advertiseosversion", &transfer_advertise_os_version);
+		/* enabled by default */
+		transfer_advertise_os_version = !!transfer_advertise_os_version;
+	}
+	return transfer_advertise_os_version;
+}
diff --git a/version.h b/version.h
index 5eb586c0bd..3e983bc623 100644
--- a/version.h
+++ b/version.h
@@ -1,6 +1,8 @@
 #ifndef VERSION_H
 #define VERSION_H
 
+struct repository;
+
 extern const char git_version_string[];
 extern const char git_built_from_commit_string[];
 
@@ -14,4 +16,23 @@ const char *git_user_agent_sanitized(void);
 */
 int get_uname_info(struct strbuf *buf, unsigned int full);
 
+/*
+  Retrieve and cache system information for subsequent calls.
+  Return a pointer to the cached system information string.
+*/
+const char *os_version(void);
+
+/*
+  Retrieve system information string from os_version(). Then
+  sanitize and cache it. Return a pointer to the sanitized
+  system information string.
+*/
+const char *os_version_sanitized(void);
+
+/*
+  Retrieve and cache whether os-version capability is enabled.
+  Return 1 if enabled, 0 if disabled.
+*/
+int advertise_os_version(struct repository *r);
+
 #endif /* VERSION_H */
-- 
2.48.0


* [PATCH v2 6/6] version: introduce osversion.command config for os-version output
  2025-01-17 10:46 ` [PATCH v2 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
                     ` (4 preceding siblings ...)
  2025-01-17 10:46   ` [PATCH v2 5/6] connect: advertise OS version Usman Akinyemi
@ 2025-01-17 10:46   ` Usman Akinyemi
  2025-01-17 21:44     ` Eric Sunshine
  2025-01-17 22:33     ` Junio C Hamano
  2025-01-24 12:21   ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
  6 siblings, 2 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-17 10:46 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	sunshine, rsbecker, Christian Couder

Currently, by default, the new `os-version` capability only exchanges
the operating system name between servers and clients (e.g., "Linux"
or "Windows").

Let's introduce a new configuration option, `osversion.command`, to handle
the string exchange between servers and clients. This option allows
customization of the exchanged string by leveraging the output of the
specified command. This customization might be especially useful on
uncommon platforms like NonStop, where interesting OS information is
available by means other than uname(2).

If this new configuration option is not set, the `os-version` capability
exchanges just the operating system name.
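
The fallback behavior described above can be sketched in shell as
follows. The helper name os_version_value_sketch is hypothetical, its
argument stands in for the value of `osVersion.command`, and running
the command through `sh -c` is an assumption made for this example;
the patch may invoke the command differently.

```shell
#!/bin/sh
# os_version_value_sketch is a hypothetical helper for illustration.
# $1 stands in for the value of osVersion.command (empty means unset).
# Assumption: the command is run through `sh -c`; the patch may run it
# differently.
os_version_value_sketch () {
	if test -n "$1"
	then
		raw=$(sh -c "$1")
	else
		raw=$(uname -s)
	fi
	# Redact bytes 1-32 and 127-255 before the value goes on the wire.
	printf '%s' "$raw" | LC_ALL=C tr '\001-\040\177-\377' '.'
}

os_version_value_sketch 'echo Custom OS 1.0'
echo
```

Whatever the command prints still goes through the same redaction as
the default value, so spaces in its output show up as dots.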

Helped-by: Randall S. Becker <rsbecker@nexbridge.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 Documentation/config/transfer.txt | 11 ++++++-
 Documentation/gitprotocol-v2.txt  | 25 ++++++++-------
 t/t5555-http-smart-common.sh      | 28 +++++++++++++++++
 t/t5701-git-serve.sh              | 29 ++++++++++++++++++
 version.c                         | 51 ++++++++++++++++++++++++++++++-
 5 files changed, 129 insertions(+), 15 deletions(-)

diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
index c368a893bd..c9f38c5796 100644
--- a/Documentation/config/transfer.txt
+++ b/Documentation/config/transfer.txt
@@ -131,4 +131,13 @@ transfer.advertiseOSVersion::
 	servers. It makes clients and servers send to each other a string
 	representing the operating system name, like "Linux" or "Windows".
 	This string is retrieved from the `sysname` field of the struct returned
-	by the uname(2) system call. Defaults to true.
+	by the uname(2) system call. If the `osVersion.command` is set, the
+	output of the command specified will be the string exchanged by the clients
+	and the servers. Defaults to true.
+
+osVersion.command::
+	If this variable is set, the specified command will be run and the output
+	will be used as the value `X` for `os-version` capability (in the form
+	`os-version=X`). `osVersion.command` is only used if `transfer.advertiseOSVersion`
+	is true. Refer to the linkgit:git-config[1] documentation to learn more about
+	`transfer.advertiseOSVersion` config option.
diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
index a332b55e4c..93a2e97ec0 100644
--- a/Documentation/gitprotocol-v2.txt
+++ b/Documentation/gitprotocol-v2.txt
@@ -193,20 +193,19 @@ the presence or absence of particular features.
 os-version
 ~~~~~~~~~~
 
-In the same way as the `agent` capability above, the server can
-advertise the `os-version` capability to notify the client the
-kind of operating system it is running on. The client may optionally
-send its own `os-version` capability, to notify the server the kind of
-operating system it is also running on in its request to the server
-(but it MUST NOT do so if the server did not advertise the os-version
-capability). The value of this capability may consist of ASCII printable
+In the same way as the `agent` capability above, the server can advertise
+the `os-version` capability to notify the client the kind of operating system
+it is running on. The client may optionally send its own `os-version` capability,
+to notify the server the kind of operating system it is also running on in its
+request to the server (but it MUST NOT do so if the server did not advertise the
+os-version capability). The value of this capability may consist of ASCII printable
 characters(from 33 to 126 inclusive) and are typically made from the result of
-`uname -s`(OS name e.g Linux). The os-version capability can be disabled
-entirely by setting the `transfer.advertiseOSVersion` config option
-to `false`. The `os-version` strings are purely informative for
-statistics and debugging purposes, and MUST NOT be used to
-programmatically assume the presence or absence of particular
-features.
+`uname -s`(OS name e.g Linux). If the `osVersion.command` is set, the value of this
+capability is made from the output of the command specified. The os-version capability
+can be disabled entirely by setting the `transfer.advertiseOSVersion` config option
+to `false`. The `os-version` strings are purely informative for statistics and
+debugging purposes, and MUST NOT be used to programmatically assume the presence or
+absence of particular features.
 
 ls-refs
 ~~~~~~~
diff --git a/t/t5555-http-smart-common.sh b/t/t5555-http-smart-common.sh
index 6f357a005a..1a3df3d090 100755
--- a/t/t5555-http-smart-common.sh
+++ b/t/t5555-http-smart-common.sh
@@ -150,6 +150,34 @@ test_expect_success 'git upload-pack --advertise-refs: v2' '
 	test_cmp actual expect
 '
 
+test_expect_success 'git upload-pack --advertise-refs: v2 with osVersion.command config set' '
+	test_config osVersion.command "uname -srvm" &&
+	printf "agent=FAKE" >agent_and_long_osversion &&
+
+	if test_have_prereq !WINDOWS
+	then
+		printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_osversion
+	fi &&
+
+	cat >expect <<-EOF &&
+	version 2
+	$(cat agent_and_long_osversion)
+	ls-refs=unborn
+	fetch=shallow wait-for-done
+	server-option
+	object-format=$(test_oid algo)
+	0000
+	EOF
+
+	GIT_PROTOCOL=version=2 \
+	GIT_USER_AGENT=FAKE \
+	git upload-pack --advertise-refs . >out 2>err &&
+
+	test-tool pkt-line unpack <out >actual &&
+	test_must_be_empty err &&
+	test_cmp actual expect
+'
+
 test_expect_success 'git receive-pack --advertise-refs: v2' '
 	# There is no v2 yet for receive-pack, implicit v0
 	cat >expect <<-EOF &&
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 8a783b3924..1395ac4eba 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -53,6 +53,35 @@ test_expect_success 'test capability advertisement' '
 	test_cmp expect actual
 '
 
+test_expect_success 'test capability advertisement with osVersion.command config set' '
+	test_config osVersion.command "uname -srvm" &&
+	printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_long_osversion &&
+
+	if test_have_prereq !WINDOWS
+	then
+		printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_osversion
+	fi &&
+
+	test_oid_cache <<-EOF &&
+	wrong_algo sha1:sha256
+	wrong_algo sha256:sha1
+	EOF
+	cat >expect_long.base <<-EOF &&
+	version 2
+	$(cat agent_and_long_osversion)
+	ls-refs=unborn
+	fetch=shallow wait-for-done
+	server-option
+	object-format=$(test_oid algo)
+	EOF
+	cat expect_long.base expect.trailer >expect &&
+
+	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
+		--advertise-capabilities >out &&
+	test-tool pkt-line unpack <out >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'stateless-rpc flag does not list capabilities' '
 	# Empty request
 	test-tool pkt-line pack >in <<-EOF &&
diff --git a/version.c b/version.c
index ea334c3e9c..2aa55e56b5 100644
--- a/version.c
+++ b/version.c
@@ -1,9 +1,13 @@
+#define USE_THE_REPOSITORY_VARIABLE
+
 #include "git-compat-util.h"
 #include "version.h"
 #include "version-def.h"
 #include "strbuf.h"
 #include "gettext.h"
 #include "config.h"
+#include "run-command.h"
+#include "alias.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -71,6 +75,50 @@ int get_uname_info(struct strbuf *buf, unsigned int full)
 	return 0;
 }
 
+/*
+ * Return -1 if unable to retrieve the osversion.command config or
+ * if the command is malformed; otherwise, return 0 if successful.
+ */
+static int fill_os_version_command(struct child_process *cmd)
+{
+	const char *os_version_command;
+	const char **argv;
+	char *os_version_copy;
+	int n;
+
+	if (git_config_get_string_tmp("osversion.command", &os_version_command))
+		return -1;
+
+	os_version_copy = xstrdup(os_version_command);
+	n = split_cmdline(os_version_copy, &argv);
+
+	if (n < 0) {
+		warning(_("malformed osVersion.command config option: %s"),
+			_(split_cmdline_strerror(n)));
+		free(os_version_copy);
+		return -1;
+	}
+
+	for (int i = 0; i < n; i++)
+		strvec_push(&cmd->args, argv[i]);
+	free(os_version_copy);
+	free(argv);
+
+	return 0;
+}
+
+static int capture_os_version(struct strbuf *buf)
+{
+	struct child_process cmd = CHILD_PROCESS_INIT;
+
+	if (fill_os_version_command(&cmd))
+		return -1;
+	if (capture_command(&cmd, buf, 0))
+		return -1;
+
+	return 0;
+}
+
 const char *os_version(void)
 {
 	static const char *os = NULL;
@@ -78,7 +126,8 @@ const char *os_version(void)
 	if (!os) {
 		struct strbuf buf = STRBUF_INIT;
 
-		get_uname_info(&buf, 0);
+		if (capture_os_version(&buf))
+			get_uname_info(&buf, 0);
 		os = strbuf_detach(&buf, NULL);
 	}
 
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 1/6] version: refactor redact_non_printables()
  2025-01-17 10:46   ` [PATCH v2 1/6] version: refactor redact_non_printables() Usman Akinyemi
@ 2025-01-17 18:26     ` Junio C Hamano
  2025-01-17 19:48       ` Junio C Hamano
  2025-01-20 17:10       ` Usman Akinyemi
  0 siblings, 2 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-17 18:26 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> The git_user_agent_sanitized() function performs some sanitizing to
> avoid special characters being sent over the line and possibly messing
> up with the protocol or with the parsing on the other side.
>
> Let's extract this sanitizing into a new redact_non_printables() function,
> as we will want to reuse it in a following patch.
>
> For now the new redact_non_printables() function is still static as
> it's only needed locally.
>
> While at it, let's use strbuf_detach() to explicitly detach the string
> contained by the 'buf' strbuf.
>
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> ---
>  version.c | 22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/version.c b/version.c
> index 4d763ab48d..78f025c808 100644
> --- a/version.c
> +++ b/version.c
> @@ -6,6 +6,20 @@
>  const char git_version_string[] = GIT_VERSION;
>  const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
>  
> +/*
> + * Trim and replace each character with ascii code below 32 or above
> + * 127 (included) using a dot '.' character.

/*
 * Trim and replace each byte outside ASCII printable
 * (33 to 127, inclusive) with a dot '.'.
 */

perhaps?

> + * TODO: ensure consecutive non-printable characters are only replaced once

I am not sure what your plans are for this change.  Has the list
reached the consensus to squish consecutive redaction dots into one
in the user-agent string?  If not, let's not mention it.  Making an
incompatible change to the user-agent string is not the primary aim
of this topic anyway.

> +*/

Funny indentation.  The asterisk should have a SP before it, just
like on the previous lines.

> +static void redact_non_printables(struct strbuf *buf)
> +{
> +	strbuf_trim(buf);
> +	for (size_t i = 0; i < buf->len; i++) {
> +		if (buf->buf[i] <= 32 || buf->buf[i] >= 127)

<sane-ctype.h> defines isprint() we can use here.

> +			buf->buf[i] = '.';
> +	}
> +}

Do we want to do anything special when the resulting buf->buf[]
becomes empty or just full of dots without anything else?  Should
the caller be told about such a condition, or is it callers'
responsibility to check if they care?  I am inclined to say that it
is the latter.

> @@ -27,12 +41,8 @@ const char *git_user_agent_sanitized(void)
>  		struct strbuf buf = STRBUF_INIT;
>  
>  		strbuf_addstr(&buf, git_user_agent());
> -		strbuf_trim(&buf);
> -		for (size_t i = 0; i < buf.len; i++) {
> -			if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
> -				buf.buf[i] = '.';
> -		}
> -		agent = buf.buf;
> +		redact_non_printables(&buf);
> +		agent = strbuf_detach(&buf, NULL);
>  	}
>  
>  	return agent;

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 3/6] version: extend get_uname_info() to hide system details
  2025-01-17 10:46   ` [PATCH v2 3/6] version: extend get_uname_info() to hide system details Usman Akinyemi
@ 2025-01-17 18:27     ` Junio C Hamano
  0 siblings, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-17 18:27 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> Currently, get_uname_info() function provides the full OS information.
> In a following commit, we will need it to provide only the OS name.
>
> Let's extend it to accept a "full" flag that makes it switch between
> providing full OS information and providing only the OS name.
>
> We may need to refactor this function in the future if an
> `osVersion.format` is added.
>
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> ---

Nice that this is made into a separate commit from the previous step.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 4/6] t5701: add setup test to remove side-effect dependency
  2025-01-17 10:46   ` [PATCH v2 4/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
@ 2025-01-17 19:31     ` Junio C Hamano
  2025-01-20 17:32       ` Usman Akinyemi
  0 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-01-17 19:31 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> -test_expect_success 'test capability advertisement' '
> +test_expect_success 'setup to generate files with expected content' '
> +	printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_osversion &&

Is this required to be "printf" and not "echo", if so why?

"git version" could contain any character if the builder gives a
custom version string by saving it in the "version" file (we use the
mechanism when we create a distribution tarball, for example).  What
happens if it contains say "%s" or something?

If you _really_ need to use printf, you'd want to do so more like:

	printf "agent=git/%s" "$(git version | cut ...)"

Is it required that agent_and_osversion lack the terminating LF?
The use of printf without terminating "\n" at the end of the format
string hints the readers that it is the case.  If you did not intend
that, perhaps doing

	printf "agent=git/%s\n" "$(git version | cut ...)"

would avoid misleading them.
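
The rule above (data goes in an argument, never in the format string)
is stated for the shell's printf, but it applies equally to C's printf
family.  A minimal sketch in C, where format_agent_line() is a
hypothetical helper and not something from the patch:

```c
#include <stdio.h>

/*
 * Build an "agent=git/<version>" line.  The version is passed as the
 * argument to "%s" and is never used as the format string itself, so a
 * version containing "%s" or similar is copied through literally.
 * The trailing "\n" makes the intended line terminator explicit.
 */
static int format_agent_line(char *out, size_t outsz, const char *version)
{
	return snprintf(out, outsz, "agent=git/%s\n", version);
}
```

With a version string such as "2.48.0-%s", the helper writes the line
unchanged, whereas using the version as the format itself would make
printf look for an argument that was never supplied.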

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 5/6] connect: advertise OS version
  2025-01-17 10:46   ` [PATCH v2 5/6] connect: advertise OS version Usman Akinyemi
@ 2025-01-17 19:35     ` Junio C Hamano
  2025-01-17 22:22     ` Junio C Hamano
  1 sibling, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-17 19:35 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> +os-version
> +~~~~~~~~~~
> +
> ...
> +characters(from 33 to 126 inclusive) and are typically made from the result of

Compared to the preceding few paragraphs, this paragraph is overly
wide (the previous iteration was much better).

I'll review this step separately later.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 1/6] version: refactor redact_non_printables()
  2025-01-17 18:26     ` Junio C Hamano
@ 2025-01-17 19:48       ` Junio C Hamano
  2025-01-20 17:10       ` Usman Akinyemi
  1 sibling, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-17 19:48 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

Junio C Hamano <gitster@pobox.com> writes:

> /*
>  * Trim and replace each byte outside ASCII printable
>  * (33 to 127, inclusive) with a dot '.'.
>  */
>
> perhaps?

"127" -> "126"; that is what an inclusive range should say.

Sorry for a noise.
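
For illustration, the corrected range (33 to 126, inclusive) can be
sketched as below.  This is not the code from the patch: it works on a
plain NUL-terminated buffer rather than a struct strbuf, the names
redact_sketch() and is_trim_space() are hypothetical, and the set of
characters trimmed (space, tab, LF, CR) is an assumption about what
strbuf_trim() removes.

```c
#include <string.h>

/* Whitespace assumed to be trimmed: space, tab, LF, CR. */
static int is_trim_space(unsigned char c)
{
	return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/*
 * Trim the buffer, then replace each byte outside ASCII printable
 * (33 to 126, inclusive) with a dot '.'.
 */
static void redact_sketch(char *s)
{
	size_t len = strlen(s);
	size_t start = 0;

	while (len > 0 && is_trim_space((unsigned char)s[len - 1]))
		len--;
	while (start < len && is_trim_space((unsigned char)s[start]))
		start++;
	memmove(s, s + start, len - start);
	len -= start;
	s[len] = '\0';

	for (size_t i = 0; i < len; i++) {
		/* Compare as unsigned so bytes >= 128 are not negative. */
		unsigned char c = (unsigned char)s[i];

		if (c < 33 || c > 126)
			s[i] = '.';
	}
}
```

Comparing through unsigned char keeps the test independent of whether
plain char is signed on the platform; an interior space (32) is
redacted as well, since it falls below 33.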

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 6/6] version: introduce osversion.command config for os-version output
  2025-01-17 10:46   ` [PATCH v2 6/6] version: introduce osversion.command config for os-version output Usman Akinyemi
@ 2025-01-17 21:44     ` Eric Sunshine
  2025-01-20 18:17       ` Usman Akinyemi
  2025-01-17 22:33     ` Junio C Hamano
  1 sibling, 1 reply; 108+ messages in thread
From: Eric Sunshine @ 2025-01-17 21:44 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, gitster, ps, johncai86,
	Johannes.Schindelin, me, phillip.wood, rsbecker, Christian Couder

On Fri, Jan 17, 2025 at 5:47 AM Usman Akinyemi
<usmanakinyemi202@gmail.com> wrote:
> Currently by default, the new `os-version` capability only exchanges the
> operating system name between servers and clients, e.g. "Linux" or
> "Windows".
>
> Let's introduce a new configuration option, `osversion.command`, to handle
> the string exchange between servers and clients. This option allows
> customization of the exchanged string by leveraging the output of the
> specified command. This customization might be especially useful on some
> quite uncommon platforms like NonStop where interesting OS information is
> available from other means than uname(2).
>
> If this new configuration option is not set, the `os-version` capability
> exchanges just the operating system name.
>
> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> ---
> diff --git a/t/t5555-http-smart-common.sh b/t/t5555-http-smart-common.sh
> @@ -150,6 +150,34 @@ test_expect_success 'git upload-pack --advertise-refs: v2' '
> +test_expect_success 'git upload-pack --advertise-refs: v2 with osVersion.command config set' '
> +       test_config osVersion.command "uname -srvm" &&
> +       printf "agent=FAKE" >agent_and_long_osversion &&
> +
> +       if test_have_prereq !WINDOWS
> +       then
> +               printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_osversion
> +       fi &&

As an aid to future readers, please add an explanation either in the
commit message or as a comment here in the code explaining why Windows
is being singled out as special.

> diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
> @@ -53,6 +53,35 @@ test_expect_success 'test capability advertisement' '
> +test_expect_success 'test capability advertisement with osVersion.command config set' '
> +       test_config osVersion.command "uname -srvm" &&
> +       printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_long_osversion &&
> +
> +       if test_have_prereq !WINDOWS
> +       then
> +               printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_osversion
> +       fi &&

Ditto.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 5/6] connect: advertise OS version
  2025-01-17 10:46   ` [PATCH v2 5/6] connect: advertise OS version Usman Akinyemi
  2025-01-17 19:35     ` Junio C Hamano
@ 2025-01-17 22:22     ` Junio C Hamano
  2025-01-17 22:47       ` rsbecker
  2025-01-20 18:15       ` Usman Akinyemi
  1 sibling, 2 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-17 22:22 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> As some issues that can happen with a Git client can be operating system
> specific, it can be useful for a server to know which OS a client is
> using. In the same way it can be useful for a client to know which OS
> a server is using.

Hmph.  The other end may be running a different version of Git, and
the version difference of _our_ software is probably more relevant.
For that matter, they may even be running something entirely
different from our software, like Gerrit.  So I am not sure I am
convinced that os-version thing is a good thing to have with that
paragraph.

> Let's introduce a new protocol (`os-version`) allowing Git clients and
> servers to exchange operating system information. The protocol is
> controlled by the new `transfer.advertiseOSVersion` config option.

The last sentence is redundant and can safely be removed.  The next
paragraph describes it better than "is controlled by".

> Add the `transfer.advertiseOSVersion` config option to address
> privacy concerns. It defaults to `true` and can be changed to
> `false`. When enabled, this option makes clients and servers send each
> other the OS name (e.g., "Linux" or "Windows"). The information is
> retrieved using the 'sysname' field of the `uname(2)` system call.

Add "or its equivalent" at the end.

macOS may have one, but it probably is not quite correct to say that
Windows has a uname system call (otherwise we wouldn't be emulating
it on top of GetVersion ourselves).

> However, there are differences between `uname(1)` (command-line utility)
> and `uname(2)` (system call) outputs on Windows. These discrepancies
> complicate testing on Windows platforms. For example:
>   - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\
>   .2024-02-14.20:17.UTC.x86_64
>   - `uname(2)` output: Windows.10.0.20348
>
> On Windows, uname(2) is not actually system-supplied but is instead
> already faked up by Git itself. We could have overcome the test issue
> on Windows by implementing a new `uname` subcommand in `test-tool`
> using uname(2), but except uname(2), which would be tested against
> itself, there would be nothing platform specific, so it's just simpler
> to disable the tests on Windows.

OK.

> +transfer.advertiseOSVersion::
> +	When `true`, the `os-version` capability is advertised by clients and
> +	servers. It makes clients and servers send to each other a string
> +	representing the operating system name, like "Linux" or "Windows".
> +	This string is retrieved from the `sysname` field of the struct returned
> +	by the uname(2) system call. Defaults to true.

Presumably, both ends of the connection independently choose whether
they enable or disable this variable, so we have 2x2=4 combinations
(here, versions of Git before the os-version capability support is
introduced behave the same way as an installation with this
configuration variable set to false).

And among these four combinations, only one of them results in "send
to each other", but the description above is fuzzy.
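
The four combinations can be spelled out with a small sketch.  The
client-side condition mirrors the connect.c hunk quoted below (the
client sends only if the server advertised the capability and the
client's own setting is on).  The server-side rule, that a server
advertises os-version only when its own setting is on, is an assumption
made for illustration; struct exchange and os_version_exchange() are
hypothetical names.

```c
#include <stdbool.h>

struct exchange {
	bool server_sends;	/* server advertises os-version=... */
	bool client_sends;	/* client sends its own os-version=... */
};

/*
 * Model which side ends up sending an os-version string, given each
 * side's transfer.advertiseOSVersion setting.  An older Git without
 * the capability behaves like a side with the setting turned off.
 */
static struct exchange os_version_exchange(bool server_enabled,
					   bool client_enabled)
{
	struct exchange e;

	/* Assumed: the server advertises only when its setting is on. */
	e.server_sends = server_enabled;
	/* The client answers only if the server advertised it first. */
	e.client_sends = server_enabled && client_enabled;
	return e;
}
```

Under these assumptions only the (true, true) case has both sides
sending; with the server on and the client off only the server sends,
and with the server off neither side sends regardless of the client's
setting.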

> diff --git a/connect.c b/connect.c
> index 10fad43e98..6d5792b63c 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -492,6 +492,9 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
>  	if (server_supports_v2("agent"))
>  		packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());
>  
> +	if (server_supports_v2("os-version") && advertise_os_version(the_repository))
> +		packet_write_fmt(fd_out, "os-version=%s", os_version_sanitized());

Not a new problem, because the new code is pretty-much a straight
copy from the existing "agent" code, but do we ever use unsanitized
versions of git-user-agent and os-version?  If not, I am wondering
if we should sanitize immediately when we obtain the raw string and
keep it, get rid of the _sanitized() functions from the public API,
and make anybody calling git_user_agent() and os_version() get
sanitized safe-to-use strings.

I see http.c throws git_user_agent() without doing any sanitization
at the cURL library, but it may be a mistake that we may want to fix
(outside the scope of this topic).  Since the contrast between the
os_version() vs the os_version_sanitized() is *new* in this series,
however, we probably would want to get it right from the beginning.

So the question is again, do we ever need to use os_version() that
is a raw string that may require sanitizing?  I cannot think of any
offhand.
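
One possible shape for "sanitize when obtained, expose one accessor" is
sketched below.  It is only an illustration under stated assumptions:
raw_os_name() is a hypothetical stand-in for the uname(2) lookup,
os_version_sketch() is not a function from the series, and trimming is
left out to keep the example short.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical raw source standing in for the uname(2) lookup. */
static const char *raw_os_name(void)
{
	return "GNU/Linux 6.1";
}

/*
 * Single public accessor: the string is sanitized once, when it is
 * first obtained, and cached, so no caller ever sees the raw form.
 */
const char *os_version_sketch(void)
{
	static char *os;

	if (!os) {
		const char *raw = raw_os_name();
		size_t len = strlen(raw);

		os = malloc(len + 1);
		if (!os)
			return "";
		memcpy(os, raw, len + 1);
		for (size_t i = 0; i < len; i++) {
			unsigned char c = (unsigned char)os[i];

			/* Replace bytes outside 33..126 with a dot. */
			if (c < 33 || c > 126)
				os[i] = '.';
		}
	}
	return os;
}
```

With this shape there is no separate raw accessor to misuse; every
caller gets the same cached, already-sanitized pointer.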

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 6/6] version: introduce osversion.command config for os-version output
  2025-01-17 10:46   ` [PATCH v2 6/6] version: introduce osversion.command config for os-version output Usman Akinyemi
  2025-01-17 21:44     ` Eric Sunshine
@ 2025-01-17 22:33     ` Junio C Hamano
  2025-01-17 22:49       ` rsbecker
  2025-01-20 18:58       ` Usman Akinyemi
  1 sibling, 2 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-17 22:33 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> Let's introduce a new configuration option, `osversion.command`, to handle
> the string exchange between servers and clients. This option allows
> customization of the exchanged string by leveraging the output of the
> specified command. This customization might be especially useful on some
> quite uncommon platforms like NonStop where interesting OS information is
> available from other means than uname(2).

After reading the above rationale, I doubt the usefulness of this
feature even more.

Shouldn't that kind of anomaly be handled by the compat/ layer to make
their uname(2) emulated, or allow get_uname_info() to be customized
at compile time by platform implementations, to yield more useful
pieces of information instead?

That way, we do not need to add another mechanism that lets people
spawn an arbitrary command while Git is running, we do not need to
worry about security implications, and we do not need to worry about
people abusing the facility to throw totally random and useless
garbage information at the other end to make their stats useless.

I'll skip the overly wide documentation changes.

> diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
> ...
> diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
> ...

> +test_expect_success 'test capability advertisement with osVersion.command config set' '
> +	test_config osVersion.command "uname -srvm" &&

If the osversion.command configuration variable turns out to be an
acceptable addition, I do not think we want to use "uname -srvm" as
its value for its test.  Do you know for sure how portable srvm is?

If you use something like "printf ' \001a\011b\015\012c '", you do
not even have to worry about how portable srvm is and on top, you
can test your unprintable-redacting logic in the code.

But all of that may be moot, if we take the "fewer customization at
runtime" approach.

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* RE: [PATCH v2 5/6] connect: advertise OS version
  2025-01-17 22:22     ` Junio C Hamano
@ 2025-01-17 22:47       ` rsbecker
  2025-01-17 23:04         ` Junio C Hamano
  2025-01-20 18:15       ` Usman Akinyemi
  1 sibling, 1 reply; 108+ messages in thread
From: rsbecker @ 2025-01-17 22:47 UTC (permalink / raw)
  To: 'Junio C Hamano', 'Usman Akinyemi'
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, 'Christian Couder'

On January 17, 2025 5:22 PM, Junio C Hamano wrote:
>Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
>> As some issues that can happen with a Git client can be operating
>> system specific, it can be useful for a server to know which OS a
>> client is using. In the same way it can be useful for a client to know
>> which OS a server is using.
>
>Hmph.  The other end may be running a different version of Git, and the
>version difference of _our_ software is probably more relevant.
>For that matter, they may even be running something entirely different
>from our software, like Gerrit.  So I am not sure I am convinced that
>os-version thing is a good thing to have with that paragraph.
>
>> Let's introduce a new protocol (`os-version`) allowing Git clients and
>> servers to exchange operating system information. The protocol is
>> controlled by the new `transfer.advertiseOSVersion` config option.
>
>The last sentence is redundant and can safely be removed.  The next
>paragraph describes it better than "is controlled by".
>
>> Add the `transfer.advertiseOSVersion` config option to address privacy
>> concerns. It defaults to `true` and can be changed to `false`. When
>> enabled, this option makes clients and servers send each other the OS
>> name (e.g., "Linux" or "Windows"). The information is retrieved using
>> the 'sysname' field of the `uname(2)` system call.
>
>Add "or its equivalent" at the end.
>
>macOS may have one, but it probably is not quite correct to say that
>Windows has a uname system call (otherwise we wouldn't be emulating it
>on top of GetVersion ourselves).
>
>> However, there are differences between `uname(1)` (command-line
>> utility) and `uname(2)` (system call) outputs on Windows. These
>> discrepancies complicate testing on Windows platforms. For example:
>>   - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\
>>   .2024-02-14.20:17.UTC.x86_64
>>   - `uname(2)` output: Windows.10.0.20348
>>
>> On Windows, uname(2) is not actually system-supplied but is instead
>> already faked up by Git itself. We could have overcome the test issue
>> on Windows by implementing a new `uname` subcommand in `test-tool`
>> using uname(2), but except uname(2), which would be tested against
>> itself, there would be nothing platform specific, so it's just simpler
>> to disable the tests on Windows.
>
>OK.
>
>> +transfer.advertiseOSVersion::
>> +	When `true`, the `os-version` capability is advertised by clients and
>> +	servers. It makes clients and servers send to each other a string
>> +	representing the operating system name, like "Linux" or "Windows".
>> +	This string is retrieved from the `sysname` field of the struct returned
>> +	by the uname(2) system call. Defaults to true.
>
>Presumably, both ends of the connection independently choose whether
>they enable or disable this variable, so we have 2x2=4 combinations
>(here, versions of Git before the os-version capability support is
>introduced behave the same way as an installation with this
>configuration variable set to false).
>
>And among these four combinations, only one of them results in "send
>to each other", but the description above is fuzzy.
>
>> diff --git a/connect.c b/connect.c
>> index 10fad43e98..6d5792b63c 100644
>> --- a/connect.c
>> +++ b/connect.c
>> @@ -492,6 +492,9 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
>>  	if (server_supports_v2("agent"))
>>  		packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());
>>
>> +	if (server_supports_v2("os-version") && advertise_os_version(the_repository))
>> +		packet_write_fmt(fd_out, "os-version=%s", os_version_sanitized());
>
>Not a new problem, because the new code is pretty-much a straight copy
>from the existing "agent" code, but do we ever use unsanitized versions
>of git-user-agent and os-version?  If not, I am wondering if we should
>sanitize immediately when we obtain the raw string and keep it, get rid
>of the _sanitized() functions from the public API, and make anybody
>calling git_user_agent() and os_version() get sanitized safe-to-use
>strings.
>
>I see http.c throws git_user_agent() without doing any sanitization at
>the cURL library, but it may be a mistake that we may want to fix
>(outside the scope of this topic).  Since the contrast between the
>os_version() vs the os_version_sanitized() is *new* in this series,
>however, we probably would want to get it right from the beginning.
>
>So the question is again, do we ever need to use os_version() that is a
>raw string that may require sanitizing?  I cannot think of any offhand.

uname(2) is definitely not portable. uname(1) is almost always available,
but there is no guarantee about uname(2). I am not entirely happy having my
builds break if having to write one between rc0 and rc1 when this rolls. How
is this being handled? os_version() is also not portable. What if we had
something that asked for specific elements of the string, by name or id?



^ permalink raw reply	[flat|nested] 108+ messages in thread

* RE: [PATCH v2 6/6] version: introduce osversion.command config for os-version output
  2025-01-17 22:33     ` Junio C Hamano
@ 2025-01-17 22:49       ` rsbecker
  2025-01-17 23:06         ` Junio C Hamano
  2025-01-20 18:58       ` Usman Akinyemi
  1 sibling, 1 reply; 108+ messages in thread
From: rsbecker @ 2025-01-17 22:49 UTC (permalink / raw)
  To: 'Junio C Hamano', 'Usman Akinyemi'
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, 'Christian Couder'

On January 17, 2025 5:34 PM, Junio C Hamano wrote:
>Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
>> Let's introduce a new configuration option, `osversion.command`, to
>> handle the string exchange between servers and clients. This option
>> allows customization of the exchanged string by leveraging the output
>> of the specified command. This customization might be especially
>> useful on some quite uncommon platforms like NonStop where interesting
>> OS information is available from other means than uname(2).
>
>After reading the above rationale, I doubt the usefulness of this feature
>even more.
>
>Shouldn't that kind of anomaly be handled by the compat/ layer to make
>their uname(2) emulated, or allow get_uname_info() to be customized at
>compile time by platform implementations, to yield more useful pieces of
>information instead?
>
>That way, we do not need to add another mechanism that lets people spawn
>an arbitrary command while Git is running, we do not need to worry about
>security implications, and we do not need to worry about people abusing
>the facility to throw totally random and useless garbage information at
>the other end to make their stats useless.
>
>I'll skip the overly wide documentation changes.
>
>> diff --git a/Documentation/config/transfer.txt
>> b/Documentation/config/transfer.txt
>> ...
>> diff --git a/Documentation/gitprotocol-v2.txt
>> b/Documentation/gitprotocol-v2.txt
>> ...
>
>> +test_expect_success 'test capability advertisement with osVersion.command config set' '
>> +	test_config osVersion.command "uname -srvm" &&
>
>If the osversion.command configuration variable turns out to be an
>acceptable addition, I do not think we want to use "uname -srvm" as
>its value for its test.  Do you know for sure how portable srvm is?
>
>If you use something like "printf ' \001a\011b\015\012c '", you do not
>even have to worry about how portable srvm is and on top, you can test
>your unprintable-redacting logic in the code.
>
>But all of that may be moot, if we take the "fewer customization at
>runtime" approach.

On my box, uname -srvm = "NONSTOP_KERNEL L24 08 NSV-D". Is this going to
break anything?


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 5/6] connect: advertise OS version
  2025-01-17 22:47       ` rsbecker
@ 2025-01-17 23:04         ` Junio C Hamano
  0 siblings, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-17 23:04 UTC (permalink / raw)
  To: rsbecker
  Cc: 'Usman Akinyemi', git, christian.couder, ps, johncai86,
	Johannes.Schindelin, me, phillip.wood, sunshine,
	'Christian Couder'

<rsbecker@nexbridge.com> writes:

>>So the question is again, do we ever need to use os_version() that is a raw
>>string that may require sanitizing?  I do not think of any offhand.
>
> uname(2) is definitely not portable. uname(1) is almost always available,
> but there is no guarantee about uname(2). I am not entirely happy having my
> builds break if having to write one between rc0 and rc1 when this rolls. How
> is this being handled? os_version() is also not portable. What if we had
> something that asked for specific elements of the string, by name or id.

Sorry, I fail to see anything in your paragraph that is relevant to
what I said.  Especially os_version() is a function that is
implemented in the patchset, not something you would complain about
being "not portable".

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 6/6] version: introduce osversion.command config for os-version output
  2025-01-17 22:49       ` rsbecker
@ 2025-01-17 23:06         ` Junio C Hamano
  2025-01-17 23:18           ` rsbecker
  0 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-01-17 23:06 UTC (permalink / raw)
  To: rsbecker
  Cc: 'Usman Akinyemi', git, christian.couder, ps, johncai86,
	Johannes.Schindelin, me, phillip.wood, sunshine,
	'Christian Couder'

<rsbecker@nexbridge.com> writes:

> On my box, uname -srvm = "NONSTOP_KERNEL L24 08 NSV-D". Is this going to
> break anything?

If you are happy with that string, then there is no need for
osversion.command configuration variable, is there?


^ permalink raw reply	[flat|nested] 108+ messages in thread

* RE: [PATCH v2 6/6] version: introduce osversion.command config for os-version output
  2025-01-17 23:06         ` Junio C Hamano
@ 2025-01-17 23:18           ` rsbecker
  0 siblings, 0 replies; 108+ messages in thread
From: rsbecker @ 2025-01-17 23:18 UTC (permalink / raw)
  To: 'Junio C Hamano'
  Cc: 'Usman Akinyemi', git, christian.couder, ps, johncai86,
	Johannes.Schindelin, me, phillip.wood, sunshine,
	'Christian Couder'

On January 17, 2025 6:06 PM, Junio C Hamano wrote:
><rsbecker@nexbridge.com> writes:
>
>> On my box, uname -srvm = "NONSTOP_KERNEL L24 08 NSV-D". Is this going
>> to break anything?
>
>If you are happy with that string, then there is no need for
>osversion.command configuration variable, is there?

I am fine with that string. If that's what will work by default, it should
be fine.

Sorry about my confusion.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 1/6] version: refactor redact_non_printables()
  2025-01-17 18:26     ` Junio C Hamano
  2025-01-17 19:48       ` Junio C Hamano
@ 2025-01-20 17:10       ` Usman Akinyemi
  2025-01-21  8:12         ` Christian Couder
  1 sibling, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-20 17:10 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

On Fri, Jan 17, 2025 at 11:56 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> > The git_user_agent_sanitized() function performs some sanitizing to
> > avoid special characters being sent over the line and possibly messing
> > up with the protocol or with the parsing on the other side.
> >
> > Let's extract this sanitizing into a new redact_non_printables() function,
> > as we will want to reuse it in a following patch.
> >
> > For now the new redact_non_printables() function is still static as
> > it's only needed locally.
> >
> > While at it, let's use strbuf_detach() to explicitly detach the string
> > contained by the 'buf' strbuf.
> >
> > Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> > Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> > ---
> >  version.c | 22 ++++++++++++++++------
> >  1 file changed, 16 insertions(+), 6 deletions(-)
> >
> > diff --git a/version.c b/version.c
> > index 4d763ab48d..78f025c808 100644
> > --- a/version.c
> > +++ b/version.c
> > @@ -6,6 +6,20 @@
> >  const char git_version_string[] = GIT_VERSION;
> >  const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
> >
> > +/*
> > + * Trim and replace each character with ascii code below 32 or above
> > + * 127 (included) using a dot '.' character.
>
> /*
>  * Trim and replace each byte outside ASCII printable
>  * (33 to 127, inclusive) with a dot '.'.
>  */
>
> perhaps?
This sounds confusing: it reads as if the bytes we are replacing with a dot
are in the range 33 to 127, whereas it is the bytes outside this range that
are replaced.
>
> > + * TODO: ensure consecutive non-printable characters are only replaced once
>
> I am not sure what your plans are for this change.  Has the list
> reached the consensus to squish consecutive redaction dots into one
> in the user-agent string?  If not, let's not mention it.  Making an
> incompatible change to the user-agent string is not the primary aim
> of this topic anyway.
>
> > +*/
>
> Funny indentation.  The asterisk should have a SP before it, just
> like on the previous lines.
My mistake, thanks for catching it. I will fix it in the next version of
the patch series.
>
> > +static void redact_non_printables(struct strbuf *buf)
> > +{
> > +     strbuf_trim(buf);
> > +     for (size_t i = 0; i < buf->len; i++) {
> > +             if (buf->buf[i] <= 32 || buf->buf[i] >= 127)
>
> <sane-ctype.h> defines isprint() we can use here.
I think it would be better to do this in a separate commit so that each
commit does one thing. I will add it after this patch series has settled.
What do you think?
>
> > +                     buf->buf[i] = '.';
> > +     }
> > +}
>
> Do we want to do anything special when the resulting buf->buf[]
> becomes empty or just full of dots without anything else?  Should
> the caller be told about such a condition, or is it callers'
> responsibility to check if they care?  I am inclined to say that it
> is the latter.
I agree.
>
> > @@ -27,12 +41,8 @@ const char *git_user_agent_sanitized(void)
> >               struct strbuf buf = STRBUF_INIT;
> >
> >               strbuf_addstr(&buf, git_user_agent());
> > -             strbuf_trim(&buf);
> > -             for (size_t i = 0; i < buf.len; i++) {
> > -                     if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
> > -                             buf.buf[i] = '.';
> > -             }
> > -             agent = buf.buf;
> > +             redact_non_printables(&buf);
> > +             agent = strbuf_detach(&buf, NULL);
> >       }
> >
> >       return agent;

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 4/6] t5701: add setup test to remove side-effect dependency
  2025-01-17 19:31     ` Junio C Hamano
@ 2025-01-20 17:32       ` Usman Akinyemi
  2025-01-20 19:52         ` Junio C Hamano
  0 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-20 17:32 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

On Sat, Jan 18, 2025 at 1:02 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> > -test_expect_success 'test capability advertisement' '
> > +test_expect_success 'setup to generate files with expected content' '
> > +     printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_osversion &&
>
> Is this required to be "printf" and not "echo", if so why?
>
> "git version" could contain any character if the builder gives a
> custom version string by saving it in the "version" file (we use the
> mechanism when we create a distribution tarball, for example).  What
> happens if it contains say "%s" or something?
There is no requirement to use "printf" here. I had not thought about
this case before; I will change it to "echo".
>
> If you _really_ need to use printf, you'd want to do so more like:
>
>         printf "agent=git/%s" "$(git version | cut ...)"
>
> Is it required that agent_and_osversion lack the terminating LF?
> The use of printf without terminating "\n" at the end of the format
> string hints the readers that it is the case.  If you did not intend
> that, perhaps doing
>
>         printf "agent=git/%s\n" "$(git version | cut ...)"
>
> would avoid misleading them.
Yeah, that is true. I did not notice this because the next commit of the
patch series fixes it. I will change it to "echo", which will be better.

Thank you.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 5/6] connect: advertise OS version
  2025-01-17 22:22     ` Junio C Hamano
  2025-01-17 22:47       ` rsbecker
@ 2025-01-20 18:15       ` Usman Akinyemi
  2025-01-21 19:06         ` Junio C Hamano
  1 sibling, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-20 18:15 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

On Sat, Jan 18, 2025 at 3:52 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> > As some issues that can happen with a Git client can be operating system
> > specific, it can be useful for a server to know which OS a client is
> > using. In the same way it can be useful for a client to know which OS
> > a server is using.
>
> Hmph.  The other end may be running a different version of Git, and
> the version difference of _our_ software is probably more relevant.
> For that matter, they may even be running something entirely
> different from our software, like Gerrit.  So I am not sure I am
> convinced that os-version thing is a good thing to have with that
> paragraph.
Hi Junio,

What would be a better way of describing this? Also, the user-agent
capability already shares the version of Git.
>
> > Let's introduce a new protocol (`os-version`) allowing Git clients and
> > servers to exchange operating system information. The protocol is
> > controlled by the new `transfer.advertiseOSVersion` config option.
>
> The last sentence is redundant and can safely be removed.  The next
> paragraph describes it better than "is controlled by".
Noted, I will do that.
>
> > Add the `transfer.advertiseOSVersion` config option to address
> > privacy concerns. It defaults to `true` and can be changed to
> > `false`. When enabled, this option makes clients and servers send each
> > other the OS name (e.g., "Linux" or "Windows"). The information is
> > retrieved using the 'sysname' field of the `uname(2)` system call.
>
> Add "or its equivalent" at the end.
>
> macOS may have one, but it probably is not quite correct to say that
> Windows has a uname system call (otherwise we wouldn't be emulating
> it on top of GetVersion ourselves).
Yeah, noted. I will do that in the next iteration.
>
> > However, there are differences between `uname(1)` (command-line utility)
> > and `uname(2)` (system call) outputs on Windows. These discrepancies
> > complicate testing on Windows platforms. For example:
> >   - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\
> >   .2024-02-14.20:17.UTC.x86_64
> >   - `uname(2)` output: Windows.10.0.20348
> >
> > On Windows, uname(2) is not actually system-supplied but is instead
> > already faked up by Git itself. We could have overcome the test issue
> > on Windows by implementing a new `uname` subcommand in `test-tool`
> > using uname(2), but except uname(2), which would be tested against
> > itself, there would be nothing platform specific, so it's just simpler
> > to disable the tests on Windows.
>
> OK.
>
> > +transfer.advertiseOSVersion::
> > +     When `true`, the `os-version` capability is advertised by clients and
> > +     servers. It makes clients and servers send to each other a string
> > +     representing the operating system name, like "Linux" or "Windows".
> > +     This string is retrieved from the `sysname` field of the struct returned
> > +     by the uname(2) system call. Defaults to true.
>
> Presumably, both ends of the connection independently choose whether
> they enable or disable this variable, so we have 2x2=4 combinations
> (here, versions of Git before the os-version capability support is
> introduced behave the same way as an installation with this
> configuration variable set to false).
>
> And among these four combinations, only one of them results in "send
> to each other", but the description above is fuzzy.
Yeah, describing the four combinations would be better, right?
>
> > diff --git a/connect.c b/connect.c
> > index 10fad43e98..6d5792b63c 100644
> > --- a/connect.c
> > +++ b/connect.c
> > @@ -492,6 +492,9 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
> >       if (server_supports_v2("agent"))
> >               packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());
> >
> > +     if (server_supports_v2("os-version") && advertise_os_version(the_repository))
> > +             packet_write_fmt(fd_out, "os-version=%s", os_version_sanitized());
>
> Not a new problem, because the new code is pretty-much a straight
> copy from the existing "agent" code, but do we ever use unsanitized
> versions of git-user-agent and os-version?  If not, I am wondering
> if we should sanitize immediately when we obtain the raw string and
> keep it, get rid of _sanitized() function from the public API, and
> make anybody calling git_user_agent() and os_version() to get
> sanitized safe-to-use strings.
>
> I see http.c throws git_user_agent() without doing any sanitization
> at the cURL library, but it may be a mistake that we may want to fix
> (outside the scope of this topic).  Since the contrast between the
> os_version() vs the os_version_sanitized() is *new* in this series,
> however, we probably would want to get it right from the beginning.
>
> So the question is again, do we ever need to use os_version() that
> is a raw string that may require sanitizing?  I do not think of any
> offhand.
In this case, I guess there has to be a conclusion on what to do.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 6/6] version: introduce osversion.command config for os-version output
  2025-01-17 21:44     ` Eric Sunshine
@ 2025-01-20 18:17       ` Usman Akinyemi
  2025-01-20 18:41         ` Eric Sunshine
  0 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-20 18:17 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: git, christian.couder, gitster, ps, johncai86,
	Johannes.Schindelin, me, phillip.wood, rsbecker, Christian Couder

On Sat, Jan 18, 2025 at 3:14 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Fri, Jan 17, 2025 at 5:47 AM Usman Akinyemi
> <usmanakinyemi202@gmail.com> wrote:
> > Currently by default, the new `os-version` capability only exchanges the
> > operating system name between servers and clients, e.g. "Linux" or
> > "Windows".
> >
> > Let's introduce a new configuration option, `osversion.command`, to handle
> > the string exchange between servers and clients. This option allows
> > customization of the exchanged string by leveraging the output of the
> > specified command. This customization might be especially useful on some
> > quite uncommon platforms like NonStop where interesting OS information is
> > available from other means than uname(2).
> >
> > If this new configuration option is not set, the `os-version` capability
> > exchanges just the operating system name.
> >
> > Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> > ---
> > diff --git a/t/t5555-http-smart-common.sh b/t/t5555-http-smart-common.sh
> > @@ -150,6 +150,34 @@ test_expect_success 'git upload-pack --advertise-refs: v2' '
> > +test_expect_success 'git upload-pack --advertise-refs: v2 with osVersion.command config set' '
> > +       test_config osVersion.command "uname -srvm" &&
> > +       printf "agent=FAKE" >agent_and_long_osversion &&
> > +
> > +       if test_have_prereq !WINDOWS
> > +       then
> > +               printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_osversion
> > +       fi &&
>
> As an aid to future readers, please add an explanation either in the
> commit message or as a comment here in the code explaining why Windows
> is being singled out as special.
>
Hi Eric,

The previous commit, which introduced this check, already explains it.
Can we reference that commit instead?

> > diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
> > @@ -53,6 +53,35 @@ test_expect_success 'test capability advertisement' '
> > +test_expect_success 'test capability advertisement with osVersion.command config set' '
> > +       test_config osVersion.command "uname -srvm" &&
> > +       printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_long_osversion &&
> > +
> > +       if test_have_prereq !WINDOWS
> > +       then
> > +               printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_osversion
> > +       fi &&
>
> Ditto.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 6/6] version: introduce osversion.command config for os-version output
  2025-01-20 18:17       ` Usman Akinyemi
@ 2025-01-20 18:41         ` Eric Sunshine
  2025-01-20 19:08           ` Usman Akinyemi
  0 siblings, 1 reply; 108+ messages in thread
From: Eric Sunshine @ 2025-01-20 18:41 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, gitster, ps, johncai86,
	Johannes.Schindelin, me, phillip.wood, rsbecker, Christian Couder

On Mon, Jan 20, 2025 at 1:17 PM Usman Akinyemi
<usmanakinyemi202@gmail.com> wrote:
> On Sat, Jan 18, 2025 at 3:14 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
> > On Fri, Jan 17, 2025 at 5:47 AM Usman Akinyemi
> > <usmanakinyemi202@gmail.com> wrote:
> > > +       if test_have_prereq !WINDOWS
> > > +       then
> > > +               printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_osversion
> > > +       fi &&
> >
> > As an aid to future readers, please add an explanation either in the
> > commit message or as a comment here in the code explaining why Windows
> > is being singled out as special.
>
> The previous commit, which introduced this check, already explains it.
> Can we reference that commit instead?

My main concern is that someone looking at this change in the future
-- who did not have the benefit of reading the cover letter or the
review discussion -- may have a hard time understanding why Windows is
singled out by this patch. As long as you give some sort of
explanation, whether in the code or in the commit message, then you
save that future user from having to figure it out on his or her own.

So, your suggestion of referencing some other commit may work.
Augmenting the commit message of this patch with something along the
lines of:

   As with the previous commit, we skip the tests on Windows.

may be enough to tell the reader where to look for the explanation.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 6/6] version: introduce osversion.command config for os-version output
  2025-01-17 22:33     ` Junio C Hamano
  2025-01-17 22:49       ` rsbecker
@ 2025-01-20 18:58       ` Usman Akinyemi
  2025-01-21 19:14         ` Junio C Hamano
  1 sibling, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-20 18:58 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

On Sat, Jan 18, 2025 at 4:03 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> > Let's introduce a new configuration option, `osversion.command`, to handle
> > the string exchange between servers and clients. This option allows
> > customization of the exchanged string by leveraging the output of the
> > specified command. This customization might be especially useful on some
> > quite uncommon platforms like NonStop where interesting OS information is
> > available from other means than uname(2).
>
> After reading the above rationale, I doubt the usefulness of this
> feature even more.
>
> Shouldn't that kind of anomalies be handled by compat/ layer to make
> their uname(2) emulated, or allow get_uname_info() to be customized
> at compile time by platform implementations, to yield more useful
> pieces of information instead?
>
> That way, we do not need to add another mechanism that lets people
> spawn an arbitrary command while Git is running, we do not need to
> worry about security implications, and we do not need to worry about
> people abusing the facility to throw totally random and useless
> garbage information at the other end to make their stats useless.
Hi Junio,

Thanks for the review.
This config option was added at Randall's request.

Randall wrote:

"Instead of an override, what about a knob that specifies the uname
command to use to build the value. Personally, I would use `uname -s
-r -v` on NonStop to get the kernel version used in the build. The
difficulty on my platform is that this is not truly useful info. The
effective build OS compatibility version is in a #define
__L_Series_RVU and __H_Series_RVU, so the knob might be needed in
git_compat_util.h or similar. This comes from the compiler arguments,
which are not yet captured."

So, the difficulty is that the compile time information might not be useful.

This patch is the last patch of the series and can also stand alone.

Thank you.

>
> I'll skip the overly wide documentation changes.
>
> > diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
> > ...
> > diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
> > ...
>
> > +test_expect_success 'test capability advertisement with osVersion.command config set' '
> > +     test_config osVersion.command "uname -srvm" &&
>
> If osversion.command configuration variable turns out to be
> acceptable addition, I do not think we want to use "uname -srvm" as
> its value for its test.  Do you know for sure how portable srvm is?
>
> If you use something like "printf ' \001a\011b\015\012c '", you do
> not even have to worry about how portable srvm is and on top, you
> can test your unprintable-redacting logic in the code.
>
> But all of that may be moot, if we take the "fewer customizations at
> runtime" approach.
>
> Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 6/6] version: introduce osversion.command config for os-version output
  2025-01-20 18:41         ` Eric Sunshine
@ 2025-01-20 19:08           ` Usman Akinyemi
  0 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-20 19:08 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: git, christian.couder, gitster, ps, johncai86,
	Johannes.Schindelin, me, phillip.wood, rsbecker, Christian Couder

On Tue, Jan 21, 2025 at 12:11 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Mon, Jan 20, 2025 at 1:17 PM Usman Akinyemi
> <usmanakinyemi202@gmail.com> wrote:
> > On Sat, Jan 18, 2025 at 3:14 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
> > > On Fri, Jan 17, 2025 at 5:47 AM Usman Akinyemi
> > > <usmanakinyemi202@gmail.com> wrote:
> > > > +       if test_have_prereq !WINDOWS
> > > > +       then
> > > > +               printf "\nos-version=%s\n" $(uname -srvm | test_redact_non_printables) >>agent_and_long_osversion
> > > > +       fi &&
> > >
> > > As an aid to future readers, please add an explanation either in the
> > > commit message or as a comment here in the code explaining why Windows
> > > is being singled out as special.
> >
> > The previous commit, which introduced this check, already explains it.
> > Can we reference that commit instead?
>
> My main concern is that someone looking at this change in the future
> -- who did not have the benefit of reading the cover letter or the
> review discussion -- may have a hard time understanding why Windows is
> singled out by this patch. As long as you give some sort of
> explanation, whether in the code or in the commit message, then you
> save that future user from having to figure it out on his or her own.
>
> So, your suggestion of referencing some other commit may work.
> Augmenting the commit message of this patch with something along the
> lines of:
>
>    As with the previous commit, we skip the tests on Windows.
>
> may be enough to tell the reader where to look for the explanation.
Yeah, thanks, this looks better and clearer. I will add this in the
next iteration if we agree to include the osversion.command config.

Thank you.
Usman Akinyemi.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 4/6] t5701: add setup test to remove side-effect dependency
  2025-01-20 17:32       ` Usman Akinyemi
@ 2025-01-20 19:52         ` Junio C Hamano
  2025-01-21 13:43           ` Usman Akinyemi
  0 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-01-20 19:52 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> Yeah, that is true. I did not notice this because the next commit of the
> patch series fixes it. I will change it to "echo", which will be better.

If we want to prepare ourselves against any arbitrary garbage the
builder may throw at us, using printf with _fixed_ format and feed
the potentially arbitrary garbage as its parameter to be
interpolated is the safest approach, so writing it as

    printf "agent=git/%s\n" "$(git version | cut ...)"

would signal the readers that whoever wrote it knew what they were
doing and was being extra careful.
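
The same rule applies to the printf family in C, which the patches in
this series also use (for example in packet_write_fmt()). As a rough
illustration only, since the advice above is about the shell printf in
the test script, here is a minimal sketch. The helper name
sketch_format_agent() is hypothetical and does not exist in Git.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of Git: the format string is fixed,
 * and the builder-supplied version is passed only as an argument, so
 * a "%" inside it is copied literally instead of being interpreted
 * as a conversion.
 */
int sketch_format_agent(char *out, size_t size, const char *version)
{
	return snprintf(out, size, "agent=git/%s\n", version);
}
```

Passing a version such as "2.%s.custom" through this helper leaves the
"%s" untouched in the output, which is the property the fixed-format
form is meant to guarantee.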

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 1/6] version: refactor redact_non_printables()
  2025-01-20 17:10       ` Usman Akinyemi
@ 2025-01-21  8:12         ` Christian Couder
  2025-01-21 18:01           ` Junio C Hamano
  0 siblings, 1 reply; 108+ messages in thread
From: Christian Couder @ 2025-01-21  8:12 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: Junio C Hamano, git, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

On Mon, Jan 20, 2025 at 6:10 PM Usman Akinyemi
<usmanakinyemi202@gmail.com> wrote:
>
> On Fri, Jan 17, 2025 at 11:56 PM Junio C Hamano <gitster@pobox.com> wrote:
> >
> > Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> > > +static void redact_non_printables(struct strbuf *buf)
> > > +{
> > > +     strbuf_trim(buf);
> > > +     for (size_t i = 0; i < buf->len; i++) {
> > > +             if (buf->buf[i] <= 32 || buf->buf[i] >= 127)
> >
> > <sane-ctype.h> defines isprint() we can use here.
> I think it would be better to do this in a separate commit so that each
> commit does one thing. I will add it after this patch series has settled.
> What do you think?

Alternatively it could be done in its own preparatory patch at the
beginning of this patch series.

<sane-ctype.h> has:

#define isprint(x) ((x) >= 0x20 && (x) <= 0x7e)

So if we wanted to use isprint() we would have to use something like:

    for (size_t i = 0; i < buf->len; i++) {
            if (!isprint(buf->buf[i]) || buf->buf[i] == ' ')
                    buf->buf[i] = '.';
    }

It would have been nicer if we didn't need a special case for SP. So I
would say it's likely a matter of taste if the result is nicer than
the original.
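
To make the comparison concrete, here is a minimal sketch of the
isprint()-based variant. It assumes a plain NUL-terminated buffer
instead of a strbuf and skips the strbuf_trim() step. The names
sketch_isprint() and sketch_redact() are hypothetical and not part of
the patch; sketch_isprint() only mirrors the range of the macro quoted
above.

```c
#include <stddef.h>

/* Same range as the isprint() macro quoted from <sane-ctype.h>. */
int sketch_isprint(int c)
{
	return c >= 0x20 && c <= 0x7e;
}

/*
 * Hypothetical stand-in for redact_non_printables(): replaces every
 * byte that is not printable, and also SP, with a dot. Unlike the
 * patch, it works on a NUL-terminated string and does not trim.
 */
void sketch_redact(char *s)
{
	for (size_t i = 0; s[i]; i++) {
		unsigned char c = (unsigned char)s[i];

		if (!sketch_isprint(c) || c == ' ')
			s[i] = '.';
	}
}
```

The condition covers bytes at or below 32 and at or above 127, so it
should redact the same set of bytes as the `<= 32 || >= 127` test in the
patch; only the spelling differs.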

> >
> > > +                     buf->buf[i] = '.';
> > > +     }
> > > +}

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 4/6] t5701: add setup test to remove side-effect dependency
  2025-01-20 19:52         ` Junio C Hamano
@ 2025-01-21 13:43           ` Usman Akinyemi
  0 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-21 13:43 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

On Tue, Jan 21, 2025 at 1:22 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> > Yeah, that is true, I could not notice this as the next commit of the
> > patch series
> > was able to fix it. I will change it to "echo", with this, it will be better.
>
> If we want to prepare ourselves against any arbitrary garbage the
> builder may throw at us, using printf with _fixed_ format and feed
> the potentially arbitrary garbage as its parameter to be
> interpolated is the safest approach, so writing it as
>
>     printf "agent=git/%s\n" "$(git version | cut ...)"
>
> would signal the readers that whoever wrote it knew what they were
> doing and was being extra careful.
>
> Thanks.
Yeah, I will add this in the next iteration.
Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 1/6] version: refactor redact_non_printables()
  2025-01-21  8:12         ` Christian Couder
@ 2025-01-21 18:01           ` Junio C Hamano
  0 siblings, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-21 18:01 UTC (permalink / raw)
  To: Christian Couder
  Cc: Usman Akinyemi, git, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> On Mon, Jan 20, 2025 at 6:10 PM Usman Akinyemi
> <usmanakinyemi202@gmail.com> wrote:
>>
>> On Fri, Jan 17, 2025 at 11:56 PM Junio C Hamano <gitster@pobox.com> wrote:
>> >
>> > Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
>> > > +static void redact_non_printables(struct strbuf *buf)
>> > > +{
>> > > +     strbuf_trim(buf);
>> > > +     for (size_t i = 0; i < buf->len; i++) {
>> > > +             if (buf->buf[i] <= 32 || buf->buf[i] >= 127)
>> >
>> > <sane-ctype.h> defines isprint() we can use here.
>> I think it would be better to do this in a separate commit so that each
>> commit does one thing. I will add it after this patch series has settled.
>> What do you think?
>
> Alternatively it could be done in its own preparatory patch at the
> beginning of this patch series.

Yup, a preliminary clean-up sounds fine, but so does a follow-up
after all the dust settles.

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 5/6] connect: advertise OS version
  2025-01-20 18:15       ` Usman Akinyemi
@ 2025-01-21 19:06         ` Junio C Hamano
  0 siblings, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-21 19:06 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> On Sat, Jan 18, 2025 at 3:52 AM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>>
>> > As some issues that can happen with a Git client can be operating system
>> > specific, it can be useful for a server to know which OS a client is
>> > using. In the same way it can be useful for a client to know which OS
>> > a server is using.
>>
>> Hmph.  The other end may be running a different version of Git, and
>> the version difference of _our_ software is probably more relevant.
>> For that matter, they may even be running something entirely
>> different from our software, like Gerrit.  So I am not sure I am
>> convinced that os-version thing is a good thing to have with that
>> paragraph.
> Hi Junio,
>
> What would be a better way of describing this? Also, the user-agent
> capability already shares the version of Git.

I dunno.  I only said that what you said does not convince me that
os-version is a good thing.  Try harder, perhaps, to be more
convincing?  After all it is your itch.

An alternative that may be conceptually cleaner is to encourage
people to include not just Git version but OS variant information in
the existing "agent" capability, making it easier to do (which
probably means an addition to configuration knob), and encourage
implementors of other Git-compatible software to also let their
systems identify themselves via the "agent" capability.

As we have documented that "agent" strings are purely informative,
there shouldn't be any problem if we started identifying the version
of Git running on one end as "git/2.47.0 Linux" (instead of
"git/2.47.0").

>> And among these four combinations, only one of them results in "send
>> to each other", but the description above is fuzzy.
> Yeah, describing the four combinations would be better, right?

I do not think readers necessarily would want to hear about the four
combinations; a paragraph that makes it clear that the configuration
is independently set on either end of the connection will make it
obvious to them without being told.

>> So the question is again, do we ever need to use os_version() that
>> is a raw string that may require sanitizing?  I do not think of any
>> offhand.
> In this case, I guess there has to be a conclusion on what to do.

As I didn't hear any concrete use cases of the version string before
sanitizing, I would suggest making os_version()

 - ask for a string from the underlying system layer (like uname(2)
   or its emulation), 

 - immediately sanitize it,

 - cache the result from the above (just like the
   os_version_sanitized() in the posted patch does with a variable
   of type "static char *" in the function scope), and

 - keep returning that sanitized version to the callers.

to simplify the API surface.
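
The four steps above can be sketched roughly as follows. This is only
an illustration under stated assumptions: it calls POSIX uname(2)
directly rather than Git's get_uname_info(), uses malloc() instead of a
strbuf, and the names sketch_sanitize(), sketch_os_version() and
sketch_fetch_count are hypothetical, not taken from the posted patch.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/utsname.h>

/* Illustration only: counts how often the system is actually asked. */
int sketch_fetch_count;

/*
 * Hypothetical helper: return a malloc'ed copy of raw in which every
 * byte at or below 32 or at or above 127 is replaced with a dot.
 * Returns NULL if the allocation fails.
 */
char *sketch_sanitize(const char *raw)
{
	size_t len = strlen(raw);
	char *out = malloc(len + 1);

	if (!out)
		return NULL;
	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)raw[i];

		out[i] = (c <= 32 || c >= 127) ? '.' : raw[i];
	}
	out[len] = '\0';
	return out;
}

/*
 * Ask the system once, sanitize immediately, cache the result in a
 * function-scope static, and keep returning the sanitized string.
 */
const char *sketch_os_version(void)
{
	static char *cached;

	if (!cached) {
		struct utsname u;

		sketch_fetch_count++;
		cached = sketch_sanitize(uname(&u) < 0 ? "unknown" : u.sysname);
	}
	return cached;
}
```

With this shape, callers never see the raw string, so a separate
os_version_sanitized() entry point would not be needed.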

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v2 6/6] version: introduce osversion.command config for os-version output
  2025-01-20 18:58       ` Usman Akinyemi
@ 2025-01-21 19:14         ` Junio C Hamano
  2025-01-21 19:51           ` rsbecker
  0 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-01-21 19:14 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, rsbecker, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

>> That way, we do not need to add another mechanism that lets people
>> spawn an arbitrary command while Git is running, we do not need to
>> worry about security implications, and we do not need to worry about
>> people abusing the facility to throw totally random and useless
>> garbage information at the other end to make their stats useless.
>
> Thanks for the review.
> This config option was added at Randall's request.
>
> Randall wrote:
>
> "Instead of an override, what about a knob that specifies the uname
> command to use to build the value. Personally, I would use `uname -s
> -r -v` on NonStop to get the kernel version used in the build. The
> difficulty on my platform is that this is not truly useful info. The
> effective build OS compatibility version is in a #define
> __L_Series_RVU and __H_Series_RVU, so the knob might be needed in
> git_compat_util.h or similar. This comes from the compiler arguments,
> which are not yet captured."
>
> So, the difficulty is that the compile time information might not be useful.

It only tells us that uname(2) gives useless information on the
platform, but there are other ways to ask the system for more useful
information.  Isn't that the same deal as how useful information
is obtained not from uname(2), since a useful one does not exist
there, but from GetVersion() on MinGW?  We do not have to spawn an
external process on MinGW to do this---we shouldn't have to do so on
NonStop, either.  We should be able to make a call into NonStop-specific
code you or Randall add in compat/ from get_uname_info()
to hide the platform-specific details, no?


^ permalink raw reply	[flat|nested] 108+ messages in thread

* RE: [PATCH v2 6/6] version: introduce osversion.command config for os-version output
  2025-01-21 19:14         ` Junio C Hamano
@ 2025-01-21 19:51           ` rsbecker
  0 siblings, 0 replies; 108+ messages in thread
From: rsbecker @ 2025-01-21 19:51 UTC (permalink / raw)
  To: 'Junio C Hamano', 'Usman Akinyemi'
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, sunshine, 'Christian Couder'

On January 21, 2025 2:14 PM, Junio C Hamano wrote:
>Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
>>> That way, we do not need to add another mechanism that lets people
>>> spawn an arbitrary command while Git is running, we do not need to
>>> worry about security implications, and we do not need to worry about
>>> people abusing the facility to throw totally random and useless
>>> garbage information at the other end to make their stats useless.
>>
>> Thanks for the review.
>> This config option was added at Randall's request.
>>
>> Randall wrote:
>>
>> "Instead of an override, what about a knob that specifies the uname
>> command to use to build the value. Personally, I would use `uname -s
>> -r -v` on NonStop to get the kernel version used in the build. The
>> difficulty on my platform is that this is not truly useful info. The
>> effective build OS compatibility version is in a #define
>> __L_Series_RVU and __H_Series_RVU, so the knob might be needed in
>> git_compat_util.h or similar. This comes from the compiler arguments,
>> which are not yet captured."
>>
>> So, the difficulty is that the compile time information might not be
>> useful.
>
>It only tells us that uname(2) gives useless information on the platform,
>but there are other ways to ask the system for more useful information.
>Isn't that the same deal as how useful information is obtained not from
>uname(2), since a useful one does not exist there, but from GetVersion()
>on MinGW?  We do not have to spawn an external process on MinGW to do
>this---we shouldn't have to do so on NonStop, either.  We should be able
>to make a call into NonStop-specific code you or Randall add in compat/
>from get_uname_info() to hide the platform-specific details, no?

I agree. Once this series is finalized, I can put together a patch to obtain
OS details on NonStop from proprietary calls. Not something I am happy about
doing, but it is what it is. I still do not get why people cannot just run
'uname -a' instead of this integration. From a support standpoint, knowing
what OS level was used in the build is more useful than git telling me what
I can get from uname. But I accept that others want this, so I'm going with
it - once the code is accepted into base git.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options
  2025-01-17 10:46 ` [PATCH v2 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
                     ` (5 preceding siblings ...)
  2025-01-17 10:46   ` [PATCH v2 6/6] version: introduce osversion.command config for os-version output Usman Akinyemi
@ 2025-01-24 12:21   ` Usman Akinyemi
  2025-01-24 12:21     ` [PATCH v3 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
                       ` (6 more replies)
  6 siblings, 7 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-24 12:21 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	rsbecker, sunshine

For debugging, statistical analysis, and security purposes, it can
be valuable for Git servers to know the operating system the clients
are using.

For example:
- A server noticing that a client is using an old Git version with
security issues on one platform, like macOS, could verify if the
user is indeed running macOS before sending a message to upgrade.
- Similarly, a server identifying a client that could benefit from
an upgrade (e.g., for performance reasons) could better customize the
message it sends to nudge the client to upgrade.

So let's add a new 'os-version' capability to the v2 protocol, in the
same way as the existing 'agent' capability that lets clients and servers
exchange the Git version they are running.

Keeping the `os-version` capability separate from other protocol
capabilities like `agent` has several benefits:

- It provides a clear separation between Git versioning and OS-specific
concerns, making troubleshooting and environment analysis more modular.
- It ensures we do not disrupt people's scripts that collect statistics
from other protocol capabilities like `agent`.
- It offers flexibility for possible future extensibility, allowing us to
add additional system-level details without modifying existing `agent`
parsing logic.
- It provides better control over privacy and security by allowing
selective exposure of OS information.

By default this sends information similar to what `git bugreport` already
collects using uname(2). The difference is that it is sanitized in the same
way as the Git version sent by the 'agent' capability is sanitized
(by replacing characters having an ASCII code of 32 or less, or of 127
or more, with '.'). Also, it sends only the result of `uname -s`, i.e.
just the operating system name (e.g. "Linux").

To address privacy concerns, let's add the `transfer.advertiseOSVersion`
config option. This boolean option is enabled by default, but allows users to
disable this feature completely by setting it to "false".

Note that, due to differences between `uname(1)` (command-line
utility) and `uname(2)` (system call) outputs on Windows,
`transfer.advertiseOSVersion` is set to false on Windows during
testing. See the message part of patch 5/6 for more details.

My mentor, Christian Couder, previously sent a patch series about this.
You can find it here:
https://lore.kernel.org/git/20240619125708.3719150-1-christian.couder@gmail.com/

Changes since v2
================
  - Dropped the last patch which introduced `osversion.command`.
  - Use isprint() to check for printable bytes in a preparatory
  patch.
  - Add a few reasons why we should have `os-version` as a separate
  capability in the commit message that introduces it.
  - Improve how `printf` is used in the tests for better clarity.
  - Refactor documentation for improved clarity.
  - Retrieve and immediately sanitize the system information in the
  same function for a simpler API surface.

Usman Akinyemi (6):
  version: replace manual ASCII checks with isprint() for clarity
  version: refactor redact_non_printables()
  version: refactor get_uname_info()
  version: extend get_uname_info() to hide system details
  t5701: add setup test to remove side-effect dependency
  connect: advertise OS version

 Documentation/config/transfer.txt |  7 +++
 Documentation/gitprotocol-v2.txt  | 17 +++++++
 builtin/bugreport.c               | 13 +-----
 connect.c                         |  3 ++
 serve.c                           | 14 ++++++
 t/t5555-http-smart-common.sh      | 10 ++++-
 t/t5701-git-serve.sh              | 30 +++++++++++--
 t/test-lib-functions.sh           |  8 ++++
 version.c                         | 73 ++++++++++++++++++++++++++++---
 version.h                         | 22 ++++++++++
 10 files changed, 175 insertions(+), 22 deletions(-)

Range-diff versus v2:

-:  ---------- > 1:  82b62c5e66 version: replace manual ASCII checks with isprint() for clarity
1:  97bccab6d5 ! 2:  0a7d7ce871 version: refactor redact_non_printables()
    @@ version.c
     +/*
     + * Trim and replace each character with ascii code below 32 or above
     + * 127 (included) using a dot '.' character.
    -+ * TODO: ensure consecutive non-printable characters are only replaced once
    -+*/
    ++ */
     +static void redact_non_printables(struct strbuf *buf)
     +{
     +	strbuf_trim(buf);
     +	for (size_t i = 0; i < buf->len; i++) {
    -+		if (buf->buf[i] <= 32 || buf->buf[i] >= 127)
    ++		if (!isprint(buf->buf[i]) || buf->buf[i] == ' ')
     +			buf->buf[i] = '.';
     +	}
     +}
    @@ version.c: const char *git_user_agent_sanitized(void)
      		strbuf_addstr(&buf, git_user_agent());
     -		strbuf_trim(&buf);
     -		for (size_t i = 0; i < buf.len; i++) {
    --			if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
    +-			if (!isprint(buf.buf[i]) || buf.buf[i] == ' ')
     -				buf.buf[i] = '.';
     -		}
     -		agent = buf.buf;
2:  1f8a4024a4 ! 3:  0187db59a4 version: refactor get_uname_info()
    @@ builtin/bugreport.c: static void get_system_info(struct strbuf *sys_info)
     
      ## version.c ##
     @@
    - #include "version.h"
      #include "version-def.h"
      #include "strbuf.h"
    + #include "sane-ctype.h"
     +#include "gettext.h"
      
      const char git_version_string[] = GIT_VERSION;
3:  962b42702f ! 4:  d3a3573594 version: extend get_uname_info() to hide system details
    @@ Commit message
         version: extend get_uname_info() to hide system details
     
         Currently, get_uname_info() function provides the full OS information.
    -    In a follwing commit, we will need it to provide only the OS name.
    +    In a following commit, we will need it to provide only the OS name.
     
         Let's extend it to accept a "full" flag that makes it switch between
         providing full OS information and providing only the OS name.
4:  7f0ec75a0d ! 5:  d9edd2ffc8 t5701: add setup test to remove side-effect dependency
    @@ t/t5701-git-serve.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
      
     -test_expect_success 'test capability advertisement' '
     +test_expect_success 'setup to generate files with expected content' '
    -+	printf "agent=git/$(git version | cut -d" " -f3)" >agent_and_osversion &&
    ++	printf "agent=git/%s\n" "$(git version | cut -d" " -f3)" >agent_and_osversion &&
     +
      	test_oid_cache <<-EOF &&
      	wrong_algo sha1:sha256
    @@ t/t5701-git-serve.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
      	ls-refs=unborn
      	fetch=shallow wait-for-done
      	server-option
    -@@ t/t5701-git-serve.sh: test_expect_success 'test capability advertisement' '
    - 	cat >expect.trailer <<-EOF &&
    + 	object-format=$(test_oid algo)
    + 	EOF
    +-	cat >expect.trailer <<-EOF &&
    ++	cat >expect.trailer <<-EOF
      	0000
      	EOF
     +'
5:  007f8582d9 ! 6:  8a936b25f7 connect: advertise OS version
    @@ Commit message
         a server is using.
     
         Let's introduce a new protocol (`os-version`) allowing Git clients and
    -    servers to exchange operating system information. The protocol is
    -    controlled by the new `transfer.advertiseOSVersion` config option.
    +    servers to exchange operating system information.
    +
    +    Having the `os-version` protocol capability separately from other protocol
    +    capabilities like `agent` is beneficial in ways like:
    +
    +    - It provides a clear separation between Git versioning and OS-specific,
    +    concerns making troubleshooting and environment analysis more modular.
    +    - It ensures we do not disrupt people's scripts that collect statistics
    +    from other protocol like `agent`.
    +    - It offers flexibility for possible future extensibility, allowing us to
    +    add additional system-level details without modifying existing `agent`
    +    parsing logic.
    +    - It provides better control over privacy and security by allowing
    +    selective exposure of OS information.
     
         Add the `transfer.advertiseOSVersion` config option to address
         privacy concerns. It defaults to `true` and can be changed to
         `false`. When enabled, this option makes clients and servers send each
         other the OS name (e.g., "Linux" or "Windows"). The information is
    -    retrieved using the 'sysname' field of the `uname(2)` system call.
    +    retrieved using the 'sysname' field of the `uname(2)` system call or its
    +    equivalent.
     
         However, there are differences between `uname(1)` (command-line utility)
         and `uname(2)` (system call) outputs on Windows. These discrepancies
    @@ Documentation/config/transfer.txt: transfer.bundleURI::
      	servers. Defaults to false.
     +
     +transfer.advertiseOSVersion::
    -+	When `true`, the `os-version` capability is advertised by clients and
    -+	servers. It makes clients and servers send to each other a string
    -+	representing the operating system name, like "Linux" or "Windows".
    -+	This string is retrieved from the `sysname` field of the struct returned
    -+	by the uname(2) system call. Defaults to true.
    ++	When set to `true` on the server, the server will advertise its
    ++	`os-version` capability to the client. On the client side, if set
    ++	to `true`, it will advertise its `os-version` capability to the
    ++	server only if the server also advertises its `os-version` capability.
    ++	Defaults to true.
     
      ## Documentation/gitprotocol-v2.txt ##
     @@ Documentation/gitprotocol-v2.txt: printable ASCII characters except space (i.e., the byte range 32 < x <
    @@ Documentation/gitprotocol-v2.txt: printable ASCII characters except space (i.e.,
     +In the same way as the `agent` capability above, the server can
     +advertise the `os-version` capability to notify the client the
     +kind of operating system it is running on. The client may optionally
    -+send its own `os-version` capability, to notify the server the kind of
    -+operating system it is also running on in its request to the server
    ++send its own `os-version` capability, to notify the server the kind
    ++of operating system it is also running on in its request to the server
     +(but it MUST NOT do so if the server did not advertise the os-version
     +capability). The value of this capability may consist of ASCII printable
    -+characters(from 33 to 126 inclusive) and are typically made from the result of
    -+`uname -s`(OS name e.g Linux). The os-version capability can be disabled
    -+entirely by setting the `transfer.advertiseOSVersion` config option
    -+to `false`. The `os-version` strings are purely informative for
    ++characters(from 33 to 126 inclusive) and are typically made from the
    ++result of `uname -s`(OS name e.g Linux). The os-version capability can
    ++be disabled entirely by setting the `transfer.advertiseOSVersion` config
    ++option to `false`. The `os-version` strings are purely informative for
     +statistics and debugging purposes, and MUST NOT be used to
    -+programmatically assume the presence or absence of particular
    -+features.
    ++programmatically assume the presence or absence of particular features.
     +
      ls-refs
      ~~~~~~~
    @@ t/t5555-http-smart-common.sh: test_expect_success 'git receive-pack --advertise-
      '
      
      test_expect_success 'git upload-pack --advertise-refs: v2' '
    -+	printf "agent=FAKE" >agent_and_osversion &&
    ++	printf "agent=FAKE\n" >agent_and_osversion &&
     +	if test_have_prereq WINDOWS
     +	then
     +		git config transfer.advertiseOSVersion false
     +	else
    -+		printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_osversion
    ++		printf "os-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_osversion
     +	fi &&
     +
      	cat >expect <<-EOF &&
    @@ t/t5555-http-smart-common.sh: test_expect_success 'git receive-pack --advertise-
     
      ## t/t5701-git-serve.sh ##
     @@ t/t5701-git-serve.sh: test_expect_success 'setup to generate files with expected content' '
    - 	cat >expect.trailer <<-EOF &&
    + 	server-option
    + 	object-format=$(test_oid algo)
    + 	EOF
    +-	cat >expect.trailer <<-EOF
    ++	cat >expect.trailer <<-EOF &&
      	0000
      	EOF
     +
    @@ t/t5701-git-serve.sh: test_expect_success 'setup to generate files with expected
     +	then
     +		git config transfer.advertiseOSVersion false
     +	else
    -+		printf "\nos-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_osversion
    ++		printf "os-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_osversion
     +	fi &&
     +
     +	cat >expect_osversion.base <<-EOF
    @@ t/test-lib-functions.sh: test_trailing_hash () {
     +# Trim and replace each character with ascii code below 32 or above
     +# 127 (included) using a dot '.' character.
     +# Octal intervals \001-\040 and \177-\377
    -+# corresponds to decimal intervals 1-32 and 127-255
    ++# correspond to decimal intervals 1-32 and 127-255
     +test_redact_non_printables () {
     +    tr -d "\n\r" | tr "[\001-\040][\177-\377]" "."
     +}
     
      ## version.c ##
     @@
    - #include "version-def.h"
      #include "strbuf.h"
    + #include "sane-ctype.h"
      #include "gettext.h"
     +#include "config.h"
      
    @@ version.c: int get_uname_info(struct strbuf *buf, unsigned int full)
      	return 0;
      }
     +
    -+const char *os_version(void)
    ++const char *os_version_sanitized(void)
     +{
     +	static const char *os = NULL;
     +
    @@ version.c: int get_uname_info(struct strbuf *buf, unsigned int full)
     +		struct strbuf buf = STRBUF_INIT;
     +
     +		get_uname_info(&buf, 0);
    ++		/* Sanitize the os information immediately */
    ++		redact_non_printables(&buf);
     +		os = strbuf_detach(&buf, NULL);
     +	}
     +
     +	return os;
     +}
     +
    -+const char *os_version_sanitized(void)
    -+{
    -+	static const char *os_sanitized = NULL;
    -+
    -+	if (!os_sanitized) {
    -+		struct strbuf buf = STRBUF_INIT;
    -+
    -+		strbuf_addstr(&buf, os_version());
    -+		redact_non_printables(&buf);
    -+		os_sanitized = strbuf_detach(&buf, NULL);
    -+	}
    -+
    -+	return os_sanitized;
    -+}
    -+
     +int advertise_os_version(struct repository *r)
     +{
     +	static int transfer_advertise_os_version = -1;
    @@ version.h: const char *git_user_agent_sanitized(void);
      int get_uname_info(struct strbuf *buf, unsigned int full);
      
     +/*
    -+  Retrieve and cache system information for subsequent calls.
    -+  Return a pointer to the cached system information string.
    -+*/
    -+const char *os_version(void);
    -+
    -+/*
    -+  Retrieve system information string from os_version(). Then
    -+  sanitize and cache it. Return a pointer to the sanitized
    -+  system information string.
    ++  Retrieve, sanitize and cache system information for subsequent
    ++  calls. Return a pointer to the sanitized system information
    ++  string.
     +*/
     +const char *os_version_sanitized(void);
     +
6:  10a07a3095 < -:  ---------- version: introduce osversion.command config for os-version output
-- 
2.48.0


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v3 1/6] version: replace manual ASCII checks with isprint() for clarity
  2025-01-24 12:21   ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
@ 2025-01-24 12:21     ` Usman Akinyemi
  2025-01-24 18:13       ` Junio C Hamano
  2025-01-24 12:21     ` [PATCH v3 2/6] version: refactor redact_non_printables() Usman Akinyemi
                       ` (5 subsequent siblings)
  6 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-24 12:21 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	rsbecker, sunshine, Christian Couder

Since the isprint() function checks for printable characters, let's
replace the existing hardcoded ASCII checks with it. However, since
the original checks also handled spaces, we need to account for spaces
explicitly in the new check.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 version.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/version.c b/version.c
index 4d763ab48d..6cfbb8ca56 100644
--- a/version.c
+++ b/version.c
@@ -2,6 +2,7 @@
 #include "version.h"
 #include "version-def.h"
 #include "strbuf.h"
+#include "sane-ctype.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -29,7 +30,7 @@ const char *git_user_agent_sanitized(void)
 		strbuf_addstr(&buf, git_user_agent());
 		strbuf_trim(&buf);
 		for (size_t i = 0; i < buf.len; i++) {
-			if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
+			if (!isprint(buf.buf[i]) || buf.buf[i] == ' ')
 				buf.buf[i] = '.';
 		}
 		agent = buf.buf;
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v3 2/6] version: refactor redact_non_printables()
  2025-01-24 12:21   ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
  2025-01-24 12:21     ` [PATCH v3 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
@ 2025-01-24 12:21     ` Usman Akinyemi
  2025-01-24 12:21     ` [PATCH v3 3/6] version: refactor get_uname_info() Usman Akinyemi
                       ` (4 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-24 12:21 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	rsbecker, sunshine, Christian Couder

The git_user_agent_sanitized() function performs some sanitizing to
avoid special characters being sent over the line and possibly messing
up the protocol or the parsing on the other side.

Let's extract this sanitizing into a new redact_non_printables() function,
as we will want to reuse it in a following patch.

For now the new redact_non_printables() function is still static as
it's only needed locally.

While at it, let's use strbuf_detach() to explicitly detach the string
contained by the 'buf' strbuf.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 version.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/version.c b/version.c
index 6cfbb8ca56..60df71fd0e 100644
--- a/version.c
+++ b/version.c
@@ -7,6 +7,19 @@
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
 
+/*
+ * Trim and replace each character with ascii code below 32 or above
+ * 127 (included) using a dot '.' character.
+ */
+static void redact_non_printables(struct strbuf *buf)
+{
+	strbuf_trim(buf);
+	for (size_t i = 0; i < buf->len; i++) {
+		if (!isprint(buf->buf[i]) || buf->buf[i] == ' ')
+			buf->buf[i] = '.';
+	}
+}
+
 const char *git_user_agent(void)
 {
 	static const char *agent = NULL;
@@ -28,12 +41,8 @@ const char *git_user_agent_sanitized(void)
 		struct strbuf buf = STRBUF_INIT;
 
 		strbuf_addstr(&buf, git_user_agent());
-		strbuf_trim(&buf);
-		for (size_t i = 0; i < buf.len; i++) {
-			if (!isprint(buf.buf[i]) || buf.buf[i] == ' ')
-				buf.buf[i] = '.';
-		}
-		agent = buf.buf;
+		redact_non_printables(&buf);
+		agent = strbuf_detach(&buf, NULL);
 	}
 
 	return agent;
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v3 3/6] version: refactor get_uname_info()
  2025-01-24 12:21   ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
  2025-01-24 12:21     ` [PATCH v3 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
  2025-01-24 12:21     ` [PATCH v3 2/6] version: refactor redact_non_printables() Usman Akinyemi
@ 2025-01-24 12:21     ` Usman Akinyemi
  2025-01-24 12:21     ` [PATCH v3 4/6] version: extend get_uname_info() to hide system details Usman Akinyemi
                       ` (3 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-24 12:21 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	rsbecker, sunshine, Christian Couder

Some code from "builtin/bugreport.c" uses uname(2) to get system
information.

Let's refactor this code into a new get_uname_info() function, so
that we can reuse it in a following commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 builtin/bugreport.c | 13 ++-----------
 version.c           | 20 ++++++++++++++++++++
 version.h           |  7 +++++++
 3 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 7c2df035c9..5e13d532a8 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -12,10 +12,10 @@
 #include "diagnose.h"
 #include "object-file.h"
 #include "setup.h"
+#include "version.h"
 
 static void get_system_info(struct strbuf *sys_info)
 {
-	struct utsname uname_info;
 	char *shell = NULL;
 
 	/* get git version from native cmd */
@@ -24,16 +24,7 @@ static void get_system_info(struct strbuf *sys_info)
 
 	/* system call for other version info */
 	strbuf_addstr(sys_info, "uname: ");
-	if (uname(&uname_info))
-		strbuf_addf(sys_info, _("uname() failed with error '%s' (%d)\n"),
-			    strerror(errno),
-			    errno);
-	else
-		strbuf_addf(sys_info, "%s %s %s %s\n",
-			    uname_info.sysname,
-			    uname_info.release,
-			    uname_info.version,
-			    uname_info.machine);
+	get_uname_info(sys_info);
 
 	strbuf_addstr(sys_info, _("compiler info: "));
 	get_compiler_info(sys_info);
diff --git a/version.c b/version.c
index 60df71fd0e..3ec8b8243d 100644
--- a/version.c
+++ b/version.c
@@ -3,6 +3,7 @@
 #include "version-def.h"
 #include "strbuf.h"
 #include "sane-ctype.h"
+#include "gettext.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -47,3 +48,22 @@ const char *git_user_agent_sanitized(void)
 
 	return agent;
 }
+
+int get_uname_info(struct strbuf *buf)
+{
+	struct utsname uname_info;
+
+	if (uname(&uname_info)) {
+		strbuf_addf(buf, _("uname() failed with error '%s' (%d)\n"),
+			    strerror(errno),
+			    errno);
+		return -1;
+	}
+
+	strbuf_addf(buf, "%s %s %s %s\n",
+		    uname_info.sysname,
+		    uname_info.release,
+		    uname_info.version,
+		    uname_info.machine);
+	return 0;
+}
diff --git a/version.h b/version.h
index 7c62e80577..afe3dbbab7 100644
--- a/version.h
+++ b/version.h
@@ -7,4 +7,11 @@ extern const char git_built_from_commit_string[];
 const char *git_user_agent(void);
 const char *git_user_agent_sanitized(void);
 
+/*
+  Try to get information about the system using uname(2).
+  Return -1 and put an error message into 'buf' in case of uname()
+  error. Return 0 and put uname info into 'buf' otherwise.
+*/
+int get_uname_info(struct strbuf *buf);
+
 #endif /* VERSION_H */
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v3 4/6] version: extend get_uname_info() to hide system details
  2025-01-24 12:21   ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
                       ` (2 preceding siblings ...)
  2025-01-24 12:21     ` [PATCH v3 3/6] version: refactor get_uname_info() Usman Akinyemi
@ 2025-01-24 12:21     ` Usman Akinyemi
  2025-01-24 12:21     ` [PATCH v3 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
                       ` (2 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-24 12:21 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	rsbecker, sunshine, Christian Couder

Currently, the get_uname_info() function provides the full OS information.
In a following commit, we will need it to provide only the OS name.

Let's extend it to accept a "full" flag that makes it switch between
providing full OS information and providing only the OS name.

We may need to refactor this function in the future if an
`osVersion.format` is added.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 builtin/bugreport.c |  2 +-
 version.c           | 16 +++++++++-------
 version.h           |  2 +-
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 5e13d532a8..e3288a86c8 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -24,7 +24,7 @@ static void get_system_info(struct strbuf *sys_info)
 
 	/* system call for other version info */
 	strbuf_addstr(sys_info, "uname: ");
-	get_uname_info(sys_info);
+	get_uname_info(sys_info, 1);
 
 	strbuf_addstr(sys_info, _("compiler info: "));
 	get_compiler_info(sys_info);
diff --git a/version.c b/version.c
index 3ec8b8243d..d95221a72a 100644
--- a/version.c
+++ b/version.c
@@ -49,7 +49,7 @@ const char *git_user_agent_sanitized(void)
 	return agent;
 }
 
-int get_uname_info(struct strbuf *buf)
+int get_uname_info(struct strbuf *buf, unsigned int full)
 {
 	struct utsname uname_info;
 
@@ -59,11 +59,13 @@ int get_uname_info(struct strbuf *buf)
 			    errno);
 		return -1;
 	}
-
-	strbuf_addf(buf, "%s %s %s %s\n",
-		    uname_info.sysname,
-		    uname_info.release,
-		    uname_info.version,
-		    uname_info.machine);
+	if (full)
+		strbuf_addf(buf, "%s %s %s %s\n",
+			    uname_info.sysname,
+			    uname_info.release,
+			    uname_info.version,
+			    uname_info.machine);
+	else
+		strbuf_addf(buf, "%s\n", uname_info.sysname);
 	return 0;
 }
diff --git a/version.h b/version.h
index afe3dbbab7..5eb586c0bd 100644
--- a/version.h
+++ b/version.h
@@ -12,6 +12,6 @@ const char *git_user_agent_sanitized(void);
   Return -1 and put an error message into 'buf' in case of uname()
   error. Return 0 and put uname info into 'buf' otherwise.
 */
-int get_uname_info(struct strbuf *buf);
+int get_uname_info(struct strbuf *buf, unsigned int full);
 
 #endif /* VERSION_H */
-- 
2.48.0
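
For anyone who wants to try the effect of the "full" flag outside of
Git's tree, a standalone approximation might look like the sketch below.
It is not the patch's code: sketch_uname_info() is a hypothetical name,
and it writes into a caller-supplied buffer with snprintf() instead of
appending to a strbuf.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Hypothetical stand-in for get_uname_info(): returns 0 and writes the
 * uname information into 'out' on success, or returns -1 and writes an
 * error message there if uname(2) fails.
 */
static int sketch_uname_info(char *out, size_t outlen, unsigned int full)
{
	struct utsname u;

	if (uname(&u) < 0) {
		snprintf(out, outlen, "uname() failed with error '%s' (%d)\n",
			 strerror(errno), errno);
		return -1;
	}
	if (full)
		/* Same four fields that `git bugreport` reports. */
		snprintf(out, outlen, "%s %s %s %s\n",
			 u.sysname, u.release, u.version, u.machine);
	else
		/* Only the OS name, like `uname -s`. */
		snprintf(out, outlen, "%s\n", u.sysname);
	return 0;
}
```

Calling it with full set to 0 yields just the sysname followed by a
newline, which is the value the os-version capability would start from
before sanitizing.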


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v3 5/6] t5701: add setup test to remove side-effect dependency
  2025-01-24 12:21   ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
                       ` (3 preceding siblings ...)
  2025-01-24 12:21     ` [PATCH v3 4/6] version: extend get_uname_info() to hide system details Usman Akinyemi
@ 2025-01-24 12:21     ` Usman Akinyemi
  2025-01-24 18:12       ` Junio C Hamano
  2025-01-24 12:21     ` [PATCH v3 6/6] connect: advertise OS version Usman Akinyemi
  2025-01-24 18:39     ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Junio C Hamano
  6 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-24 12:21 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	rsbecker, sunshine, Christian Couder

Currently, the "test capability advertisement" test creates some files
with expected content which are used by other tests below it.

To remove that side-effect from this test, let's split up part of
it into a "setup"-type test which creates the files with expected content
which gets reused by multiple tests. This will be useful in a following
commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 t/t5701-git-serve.sh | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index de904c1655..9394235fa0 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -7,22 +7,28 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
-test_expect_success 'test capability advertisement' '
+test_expect_success 'setup to generate files with expected content' '
+	printf "agent=git/%s\n" "$(git version | cut -d" " -f3)" >agent_and_osversion &&
+
 	test_oid_cache <<-EOF &&
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
+
 	cat >expect.base <<-EOF &&
 	version 2
-	agent=git/$(git version | cut -d" " -f3)
+	$(cat agent_and_osversion)
 	ls-refs=unborn
 	fetch=shallow wait-for-done
 	server-option
 	object-format=$(test_oid algo)
 	EOF
-	cat >expect.trailer <<-EOF &&
+	cat >expect.trailer <<-EOF
 	0000
 	EOF
+'
+
+test_expect_success 'test capability advertisement' '
 	cat expect.base expect.trailer >expect &&
 
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v3 6/6] connect: advertise OS version
  2025-01-24 12:21   ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
                       ` (4 preceding siblings ...)
  2025-01-24 12:21     ` [PATCH v3 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
@ 2025-01-24 12:21     ` Usman Akinyemi
  2025-02-05 18:52       ` [PATCH v4 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
  2025-01-24 18:39     ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Junio C Hamano
  6 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-24 12:21 UTC (permalink / raw)
  To: git, christian.couder
  Cc: gitster, ps, johncai86, Johannes.Schindelin, me, phillip.wood,
	rsbecker, sunshine, Christian Couder

As some issues that can happen with a Git client can be operating system
specific, it can be useful for a server to know which OS a client is
using. In the same way it can be useful for a client to know which OS
a server is using.

Let's introduce a new protocol capability (`os-version`) allowing Git
clients and servers to exchange operating system information.

Having the `os-version` protocol capability separate from other protocol
capabilities like `agent` is beneficial in several ways:

- It provides a clear separation between Git versioning and OS-specific
concerns, making troubleshooting and environment analysis more modular.
- It ensures we do not disrupt people's scripts that collect statistics
from other protocol capabilities like `agent`.
- It offers flexibility for possible future extensibility, allowing us to
add additional system-level details without modifying existing `agent`
parsing logic.
- It provides better control over privacy and security by allowing
selective exposure of OS information.

Add the `transfer.advertiseOSVersion` config option to address
privacy concerns. It defaults to `true` and can be changed to
`false`. When enabled, this option makes clients and servers send each
other the OS name (e.g., "Linux" or "Windows"). The information is
retrieved using the 'sysname' field of the `uname(2)` system call or its
equivalent.

However, there are differences between `uname(1)` (command-line utility)
and `uname(2)` (system call) outputs on Windows. These discrepancies
complicate testing on Windows platforms. For example:
  - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\
  .2024-02-14.20:17.UTC.x86_64
  - `uname(2)` output: Windows.10.0.20348

On Windows, uname(2) is not actually system-supplied but is instead
already emulated by Git itself. We could have overcome the test issue
on Windows by implementing a new `uname` subcommand in `test-tool`
using uname(2), but then uname(2) would merely be tested against
itself and nothing platform-specific would be exercised, so it's
simpler to disable the tests on Windows.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 Documentation/config/transfer.txt |  7 +++++++
 Documentation/gitprotocol-v2.txt  | 17 +++++++++++++++++
 connect.c                         |  3 +++
 serve.c                           | 14 ++++++++++++++
 t/t5555-http-smart-common.sh      | 10 +++++++++-
 t/t5701-git-serve.sh              | 22 +++++++++++++++++++---
 t/test-lib-functions.sh           |  8 ++++++++
 version.c                         | 29 +++++++++++++++++++++++++++++
 version.h                         | 15 +++++++++++++++
 9 files changed, 121 insertions(+), 4 deletions(-)

diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
index f1ce50f4a6..016eb27430 100644
--- a/Documentation/config/transfer.txt
+++ b/Documentation/config/transfer.txt
@@ -125,3 +125,10 @@ transfer.bundleURI::
 transfer.advertiseObjectInfo::
 	When `true`, the `object-info` capability is advertised by
 	servers. Defaults to false.
+
+transfer.advertiseOSVersion::
+	When set to `true` on the server, the server will advertise its
+	`os-version` capability to the client. On the client side, if set
+	to `true`, it will advertise its `os-version` capability to the
+	server only if the server also advertises its `os-version` capability.
+	Defaults to true.
diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
index 1652fef3ae..62f7ae3423 100644
--- a/Documentation/gitprotocol-v2.txt
+++ b/Documentation/gitprotocol-v2.txt
@@ -190,6 +190,23 @@ printable ASCII characters except space (i.e., the byte range 32 < x <
 and debugging purposes, and MUST NOT be used to programmatically assume
 the presence or absence of particular features.
 
+os-version
+~~~~~~~~~~
+
+In the same way as the `agent` capability above, the server can
+advertise the `os-version` capability to notify the client of the
+kind of operating system it is running on. The client may optionally
+send its own `os-version` capability in its request, to notify the
+server of the kind of operating system it is running on
+(but it MUST NOT do so if the server did not advertise the os-version
+capability). The value of this capability may consist of ASCII printable
+characters (from 33 to 126 inclusive) and is typically made from the
+result of `uname -s` (OS name, e.g. Linux). The os-version capability can
+be disabled entirely by setting the `transfer.advertiseOSVersion` config
+option to `false`. The `os-version` strings are purely informative for
+statistics and debugging purposes, and MUST NOT be used to
+programmatically assume the presence or absence of particular features.
+
 ls-refs
 ~~~~~~~
 
diff --git a/connect.c b/connect.c
index 10fad43e98..6d5792b63c 100644
--- a/connect.c
+++ b/connect.c
@@ -492,6 +492,9 @@ static void send_capabilities(int fd_out, struct packet_reader *reader)
 	if (server_supports_v2("agent"))
 		packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());
 
+	if (server_supports_v2("os-version") && advertise_os_version(the_repository))
+		packet_write_fmt(fd_out, "os-version=%s", os_version_sanitized());
+
 	if (server_feature_v2("object-format", &hash_name)) {
 		int hash_algo = hash_algo_by_name(hash_name);
 		if (hash_algo == GIT_HASH_UNKNOWN)
diff --git a/serve.c b/serve.c
index c8694e3751..5b0d54ae9a 100644
--- a/serve.c
+++ b/serve.c
@@ -31,6 +31,16 @@ static int agent_advertise(struct repository *r UNUSED,
 	return 1;
 }
 
+static int os_version_advertise(struct repository *r,
+			   struct strbuf *value)
+{
+	if (!advertise_os_version(r))
+		return 0;
+	if (value)
+		strbuf_addstr(value, os_version_sanitized());
+	return 1;
+}
+
 static int object_format_advertise(struct repository *r,
 				   struct strbuf *value)
 {
@@ -123,6 +133,10 @@ static struct protocol_capability capabilities[] = {
 		.name = "agent",
 		.advertise = agent_advertise,
 	},
+	{
+		.name = "os-version",
+		.advertise = os_version_advertise,
+	},
 	{
 		.name = "ls-refs",
 		.advertise = ls_refs_advertise,
diff --git a/t/t5555-http-smart-common.sh b/t/t5555-http-smart-common.sh
index e47ea1ad10..b1af37a4a2 100755
--- a/t/t5555-http-smart-common.sh
+++ b/t/t5555-http-smart-common.sh
@@ -123,9 +123,17 @@ test_expect_success 'git receive-pack --advertise-refs: v1' '
 '
 
 test_expect_success 'git upload-pack --advertise-refs: v2' '
+	printf "agent=FAKE\n" >agent_and_osversion &&
+	if test_have_prereq WINDOWS
+	then
+		git config transfer.advertiseOSVersion false
+	else
+		printf "os-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_osversion
+	fi &&
+
 	cat >expect <<-EOF &&
 	version 2
-	agent=FAKE
+	$(cat agent_and_osversion)
 	ls-refs=unborn
 	fetch=shallow wait-for-done
 	server-option
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 9394235fa0..2616132b95 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -23,13 +23,29 @@ test_expect_success 'setup to generate files with expected content' '
 	server-option
 	object-format=$(test_oid algo)
 	EOF
-	cat >expect.trailer <<-EOF
+	cat >expect.trailer <<-EOF &&
 	0000
 	EOF
+
+	if test_have_prereq WINDOWS
+	then
+		git config transfer.advertiseOSVersion false
+	else
+		printf "os-version=%s\n" $(uname -s | test_redact_non_printables) >>agent_and_osversion
+	fi &&
+
+	cat >expect_osversion.base <<-EOF
+	version 2
+	$(cat agent_and_osversion)
+	ls-refs=unborn
+	fetch=shallow wait-for-done
+	server-option
+	object-format=$(test_oid algo)
+	EOF
 '
 
 test_expect_success 'test capability advertisement' '
-	cat expect.base expect.trailer >expect &&
+	cat expect_osversion.base expect.trailer >expect &&
 
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
 		--advertise-capabilities >out &&
@@ -357,7 +373,7 @@ test_expect_success 'test capability advertisement with uploadpack.advertiseBund
 	cat >expect.extra <<-EOF &&
 	bundle-uri
 	EOF
-	cat expect.base \
+	cat expect_osversion.base \
 	    expect.extra \
 	    expect.trailer >expect &&
 
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78e054ab50..3465904323 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -2007,3 +2007,11 @@ test_trailing_hash () {
 		test-tool hexdump |
 		sed "s/ //g"
 }
+
+# Remove newlines and carriage returns, then replace each character
+# with ascii code 32 or below, or 127 or above, with a dot '.' character.
+# Octal intervals \001-\040 and \177-\377
+# correspond to decimal intervals 1-32 and 127-255
+test_redact_non_printables () {
+    tr -d "\n\r" | tr "[\001-\040][\177-\377]" "."
+}
diff --git a/version.c b/version.c
index d95221a72a..083154a6cb 100644
--- a/version.c
+++ b/version.c
@@ -4,6 +4,7 @@
 #include "strbuf.h"
 #include "sane-ctype.h"
 #include "gettext.h"
+#include "config.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -69,3 +70,31 @@ int get_uname_info(struct strbuf *buf, unsigned int full)
 	     strbuf_addf(buf, "%s\n", uname_info.sysname);
 	return 0;
 }
+
+const char *os_version_sanitized(void)
+{
+	static const char *os = NULL;
+
+	if (!os) {
+		struct strbuf buf = STRBUF_INIT;
+
+		get_uname_info(&buf, 0);
+		/* Sanitize the os information immediately */
+		redact_non_printables(&buf);
+		os = strbuf_detach(&buf, NULL);
+	}
+
+	return os;
+}
+
+int advertise_os_version(struct repository *r)
+{
+	static int transfer_advertise_os_version = -1;
+
+	if (transfer_advertise_os_version == -1) {
+		repo_config_get_bool(r, "transfer.advertiseosversion", &transfer_advertise_os_version);
+		/* enabled by default */
+		transfer_advertise_os_version = !!transfer_advertise_os_version;
+	}
+	return transfer_advertise_os_version;
+}
diff --git a/version.h b/version.h
index 5eb586c0bd..300ee73df5 100644
--- a/version.h
+++ b/version.h
@@ -1,6 +1,8 @@
 #ifndef VERSION_H
 #define VERSION_H
 
+struct repository;
+
 extern const char git_version_string[];
 extern const char git_built_from_commit_string[];
 
@@ -14,4 +16,17 @@ const char *git_user_agent_sanitized(void);
 */
 int get_uname_info(struct strbuf *buf, unsigned int full);
 
+/*
+  Retrieve, sanitize and cache system information for subsequent
+  calls. Return a pointer to the sanitized system information
+  string.
+*/
+const char *os_version_sanitized(void);
+
+/*
+  Retrieve and cache whether os-version capability is enabled.
+  Return 1 if enabled, 0 if disabled.
+*/
+int advertise_os_version(struct repository *r);
+
 #endif /* VERSION_H */
-- 
2.48.0
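
The client-side rule in this patch (send `os-version` only if the server
advertised it and the local setting allows it) can be sketched in
isolation as below. This is not the code from connect.c: the helpers
server_advertised() and client_os_version_line(), and the plain
newline-separated capability list, are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if 'name' appears in the
 * newline-separated capability list 'advertised', either bare
 * ("name") or with a value ("name=value"); return 0 otherwise.
 */
static int server_advertised(const char *advertised, const char *name)
{
	size_t len = strlen(name);
	const char *p = advertised;

	while (*p) {
		const char *eol = strchr(p, '\n');
		size_t linelen = eol ? (size_t)(eol - p) : strlen(p);

		if (linelen >= len && !strncmp(p, name, len) &&
		    (linelen == len || p[len] == '='))
			return 1;
		p += linelen;
		if (*p == '\n')
			p++;
	}
	return 0;
}

/*
 * Write "os-version=<os>" into 'out' only when the server advertised
 * the capability and 'enabled' (the local setting) is non-zero.
 * Return 1 if a line was written, 0 if nothing should be sent.
 */
static int client_os_version_line(const char *advertised, int enabled,
				  const char *os, char *out, size_t outlen)
{
	int n;

	assert(advertised && os && out && outlen > 0);
	out[0] = '\0';
	if (!enabled || !server_advertised(advertised, "os-version"))
		return 0;
	n = snprintf(out, outlen, "os-version=%s", os);
	if (n < 0 || (size_t)n >= outlen) {
		out[0] = '\0';
		return 0;
	}
	return 1;
}
```

Checking the server's advertisement first keeps the client within the
"MUST NOT" rule in the protocol documentation, independently of how the
local configuration is set.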


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* Re: [PATCH v3 5/6] t5701: add setup test to remove side-effect dependency
  2025-01-24 12:21     ` [PATCH v3 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
@ 2025-01-24 18:12       ` Junio C Hamano
  0 siblings, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-24 18:12 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, rsbecker, sunshine, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> Currently, the "test capability advertisement" test creates some files
> with expected content which are used by other tests below it.
>
> To remove that side-effect from this test, let's split up part of
> it into a "setup"-type test which creates the files with expected content
> which gets reused by multiple tests. This will be useful in a following
> commit.
>
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> ---
>  t/t5701-git-serve.sh | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)

Nice clean-up.

>
> diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
> index de904c1655..9394235fa0 100755
> --- a/t/t5701-git-serve.sh
> +++ b/t/t5701-git-serve.sh
> @@ -7,22 +7,28 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>  
>  . ./test-lib.sh
>  
> -test_expect_success 'test capability advertisement' '
> +test_expect_success 'setup to generate files with expected content' '
> +	printf "agent=git/%s\n" "$(git version | cut -d" " -f3)" >agent_and_osversion &&
> +
>  	test_oid_cache <<-EOF &&
>  	wrong_algo sha1:sha256
>  	wrong_algo sha256:sha1
>  	EOF
> +
>  	cat >expect.base <<-EOF &&
>  	version 2
> -	agent=git/$(git version | cut -d" " -f3)
> +	$(cat agent_and_osversion)
>  	ls-refs=unborn
>  	fetch=shallow wait-for-done
>  	server-option
>  	object-format=$(test_oid algo)
>  	EOF
> -	cat >expect.trailer <<-EOF &&
> +	cat >expect.trailer <<-EOF
>  	0000
>  	EOF
> +'
> +
> +test_expect_success 'test capability advertisement' '
>  	cat expect.base expect.trailer >expect &&
>  
>  	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v3 1/6] version: replace manual ASCII checks with isprint() for clarity
  2025-01-24 12:21     ` [PATCH v3 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
@ 2025-01-24 18:13       ` Junio C Hamano
  0 siblings, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-24 18:13 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, rsbecker, sunshine, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> Since the isprint() function checks for printable characters, let's
> replace the existing hardcoded ASCII checks with it. However, since
> the original checks also handled spaces, we need to account for spaces
> explicitly in the new check.
>
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> ---
>  version.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Thanks.  Nicely done as a separate step.

> diff --git a/version.c b/version.c
> index 4d763ab48d..6cfbb8ca56 100644
> --- a/version.c
> +++ b/version.c
> @@ -2,6 +2,7 @@
>  #include "version.h"
>  #include "version-def.h"
>  #include "strbuf.h"
> +#include "sane-ctype.h"
>  
>  const char git_version_string[] = GIT_VERSION;
>  const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
> @@ -29,7 +30,7 @@ const char *git_user_agent_sanitized(void)
>  		strbuf_addstr(&buf, git_user_agent());
>  		strbuf_trim(&buf);
>  		for (size_t i = 0; i < buf.len; i++) {
> -			if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
> +			if (!isprint(buf.buf[i]) || buf.buf[i] == ' ')
>  				buf.buf[i] = '.';
>  		}
>  		agent = buf.buf;

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options
  2025-01-24 12:21   ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
                       ` (5 preceding siblings ...)
  2025-01-24 12:21     ` [PATCH v3 6/6] connect: advertise OS version Usman Akinyemi
@ 2025-01-24 18:39     ` Junio C Hamano
  2025-01-27 13:38       ` Christian Couder
  6 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-01-24 18:39 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, rsbecker, sunshine

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> For debugging, statistical analysis, and security purposes, it can
> be valuable for Git servers to know the operating system the clients
> are using.

OK.  I think the reorganization done in this round makes it much
easier to see what is going on in each step.  Very well done.

The only remaining issue from my point of view is if we really want
this as a separate and new knob with capability, or if we would be
better off to carry this kind of extra piece of information by
enhancing existing "agent" capability.  Given what Web Browsers do
in their UA strings, it does feel cumbersome for analytics tools to
pay attention to two separate input sources (os-version and agent).

Has somebody brought up any downsides of cramming the OS information
into the existing agent thing?  I have not thought of any possible
downsides since I made this suggestion in a previous review of this
topic, but I may be missing something obvious, so...

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options
  2025-01-24 18:39     ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Junio C Hamano
@ 2025-01-27 13:38       ` Christian Couder
  2025-01-27 15:26         ` Junio C Hamano
  0 siblings, 1 reply; 108+ messages in thread
From: Christian Couder @ 2025-01-27 13:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Usman Akinyemi, git, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, rsbecker, sunshine

On Fri, Jan 24, 2025 at 7:39 PM Junio C Hamano <gitster@pobox.com> wrote:

> The only remaining issue from my point of view is if we really want
> this as a separate and new knob with capability, or if we would be
> better off to carry this kind of extra piece of information by
> enhancing existing "agent" capability.  Given what Web Browsers do
> in their UA strings, it does feel cumbersome for analytics tools to
> pay attention to two separate input sources (os-version and agent).
>
> Has somebody brought up any downsides of cramming the OS information
> into the existing agent thing?  I have not thought of any possible
> downsides since I made this suggestion in a previous review of this
> topic, but I may be missing something obvious, so...

My opinion is that it isn't a good idea to enhance the existing
"agent" capability. Yeah, it goes in the same direction as what web
browsers have been doing with the User-Agent header, but I think web
browsers are an especially bad example that we should strive not to
follow.

According to Wikipedia
(https://en.wikipedia.org/wiki/User-Agent_header) the format for the
User-Agent header is now "Mozilla/[version] ([system and browser
information]) [platform] ([platform details]) [extensions]", for
example "Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us)
AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405". This is
obviously very difficult to parse for everyone including analytics
tools and is not very flexible either. It serves as a way to pass
information about available features, but leaks some private
information in the process. The fact that it's used to pass
information about available features has led to a lot of user agent
spoofing which means that analytics, statistics and debugging are
likely harder than they need to be.

When Git developed capabilities and the "agent" capability, the doc
took care of saying that it "MUST NOT be used to
programmatically assume the presence or absence of particular
features". This was done to go in the direction of not passing more
information through this "agent" capability but instead use separate
ones. So I think we should just avoid putting other things in the
"agent" capability to avoid what happened to the User-Agent header in
browsers and to stay true to our original intent to have a different
capability for each advertised information or feature.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v4 2/6] version: refactor redact_non_printables()
  2025-01-27 15:16 ` [PATCH v4 0/6] " Christian Couder
@ 2025-01-27 15:16   ` Christian Couder
  0 siblings, 0 replies; 108+ messages in thread
From: Christian Couder @ 2025-01-27 15:16 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Patrick Steinhardt, Taylor Blau, Eric Sunshine,
	Karthik Nayak, Kristoffer Haugsbakk, brian m . carlson,
	Randall S . Becker, Usman Akinyemi, Christian Couder

From: Usman Akinyemi <usmanakinyemi202@gmail.com>

The git_user_agent_sanitized() function performs some sanitizing to
avoid special characters being sent over the line and possibly messing
up with the protocol or with the parsing on the other side.

Let's extract this sanitizing into a new redact_non_printables() function,
as we will want to reuse it in a following patch.

For now the new redact_non_printables() function is still static as
it's only needed locally.

While at it, let's use strbuf_detach() to explicitly detach the string
contained by the 'buf' strbuf.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 version.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/version.c b/version.c
index c9192a5beb..4f37b4499d 100644
--- a/version.c
+++ b/version.c
@@ -12,6 +12,19 @@
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
 
+/*
+ * Trim and replace each character with ascii code 32 or below, or
+ * 127 or above, using a dot '.' character.
+ */
+static void redact_non_printables(struct strbuf *buf)
+{
+	strbuf_trim(buf);
+	for (size_t i = 0; i < buf->len; i++) {
+		if (!isprint(buf->buf[i]) || buf->buf[i] == ' ')
+			buf->buf[i] = '.';
+	}
+}
+
 const char *git_user_agent(void)
 {
 	static const char *agent = NULL;
@@ -33,12 +46,8 @@ const char *git_user_agent_sanitized(void)
 		struct strbuf buf = STRBUF_INIT;
 
 		strbuf_addstr(&buf, git_user_agent());
-		strbuf_trim(&buf);
-		for (size_t i = 0; i < buf.len; i++) {
-			if (!isprint(buf.buf[i]) || buf.buf[i] == ' ')
-				buf.buf[i] = '.';
-		}
-		agent = buf.buf;
+		redact_non_printables(&buf);
+		agent = strbuf_detach(&buf, NULL);
 	}
 
 	return agent;
-- 
2.46.0.rc0.95.gcbf174a634
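
For readers without the Git tree at hand, the redaction loop can be
approximated outside Git as follows. This is only a sketch under stated
assumptions: redact_sketch() is a hypothetical name, it works on a plain
NUL-terminated string rather than a strbuf, it omits the strbuf_trim()
step, and it uses the standard <ctype.h> isprint() with an explicit
guard for bytes above 127 instead of Git's locale-independent
sane-ctype.h version.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Replace every byte that is not a printable, non-space ASCII
 * character (i.e. anything outside 33..126) with '.'.
 */
static void redact_sketch(char *s)
{
	assert(s);
	for (size_t i = 0; s[i]; i++) {
		/* Cast first: passing a negative char to isprint() is undefined. */
		unsigned char c = (unsigned char)s[i];

		if (c > 127 || !isprint(c) || c == ' ')
			s[i] = '.';
	}
}
```

The explicit `c == ' '` test is what keeps the behaviour of the older
`<= 32` comparison, since isprint() alone treats a space as printable.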


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* Re: [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options
  2025-01-27 13:38       ` Christian Couder
@ 2025-01-27 15:26         ` Junio C Hamano
  2025-01-31 14:30           ` Christian Couder
  0 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-01-27 15:26 UTC (permalink / raw)
  To: Christian Couder
  Cc: Usman Akinyemi, git, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, rsbecker, sunshine

Christian Couder <christian.couder@gmail.com> writes:

> information in the process. The fact that it's used to pass
> information about available features has led to a lot of user agent
> spoofing which means that analytics, statistics and debugging are
> likely harder than they need to be.

Yes, that is a valid viewpoint, but ...

> When Git developed capabilities and the "agent" capability, the doc
> took care of saying that it "MUST NOT be used to
> programmatically assume the presence or absence of particular
> features".

... the proposed os-version thing has the same wording in its
documentation, doesn't it?  What is being added is not to be used
in a way that requires parsing and trusting the result.

So unless your point is that users (like those who parse User-Agent
string by browsers) will do the wrong thing and assume these strings
are usable for feature detection anyway so we should make it easier
to parse, I'd have to disagree.  If we are not aiming to make it
easier to parse and assume certain things that we do not want them
to, I do not see why we need to have the pieces of information in
two separate capabilities.

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options
  2025-01-27 15:26         ` Junio C Hamano
@ 2025-01-31 14:30           ` Christian Couder
  2025-01-31 16:37             ` Junio C Hamano
  0 siblings, 1 reply; 108+ messages in thread
From: Christian Couder @ 2025-01-31 14:30 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Usman Akinyemi, git, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, rsbecker, sunshine

On Mon, Jan 27, 2025 at 4:26 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > information in the process. The fact that it's used to pass
> > information about available features has led to a lot of user agent
> > spoofing which means that analytics, statistics and debugging are
> > likely harder than they need to be.
>
> Yes, that is a valid viewpoint, but ...
>
> > When Git developed capabilities and the "agent" capability, the doc
> > took care of saying that it "MUST NOT be used to
> > programmatically assume the presence or absence of particular
> > features".
>
> ... the proposed os-version thing has the same wording in its
> documentation, doesn't it?

Yeah, we repeat it to make sure that users read it. I am fine with
refactoring that wording if we think that having it once is enough.

> What is being added is not to be used
> in a way that requires parsing and trusting the result.

Why not? If server people want to do OS stats on their clients, for
example, why shouldn't they parse and trust the result?

> So unless your point is that users (like those who parse User-Agent
> string by browsers) will do the wrong thing and assume these strings
> are usable for feature detection anyway so we should make it easier
> to parse, I'd have to disagree.

We should make it easy to parse because people will use this field
(otherwise why are we adding it?), and we want to make it easy to use
rather than hard, simply to be nice to our users.

I think we should not assume that they will do the wrong thing,
especially if our docs are clear about how it shouldn't be used.

>  If we are not aiming to make it
> easier to parse and assume certain things that we do not want them
> to, I do not see why we need to have the pieces of information in
> two separate capabilities.

I think it's just the right thing to make it easy to parse. Doing OS
stats on the server side doesn't need to be unnecessarily hard.

By the way, if we put the OS information in the "agent" capability,
how do we separate it from the existing "package/version" content and
make it easy to parse? I don't see a good solution because
GIT_USER_AGENT could be used, and the config option to not show the OS
name could be used too.

Also we don't know what could be in the "version" part. The doc says
that the agent part is typically of the form "package/version" but
doesn't require it.

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options
  2025-01-31 14:30           ` Christian Couder
@ 2025-01-31 16:37             ` Junio C Hamano
  2025-01-31 19:42               ` Usman Akinyemi
  2025-01-31 19:46               ` Usman Akinyemi
  0 siblings, 2 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-31 16:37 UTC (permalink / raw)
  To: Christian Couder
  Cc: Usman Akinyemi, git, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, rsbecker, sunshine

Christian Couder <christian.couder@gmail.com> writes:

> By the way, if we put the OS information in the "agent" capability,
> how do we separate it from the existing "package/version" content and
> make it easy to parse?

Do NOT parse, period.

If three "things" that talk the Git protocol on the other end of the
connection give "Linux git/2.48.0", "macOS libgit2/1.9.0", and
"Windows git/2.47.1" as their (enhanced) "agent" strings, there is
no "ah, this one is 1.9.0 which is way older than 2.47.1 so it must be
missing features X and Y" the users of the information are allowed
to infer.

Just take it as a single opaque string, and group identical ones.

In the above scenario, we found three different kinds now.  Maybe
we'll accumulate the counts and notice that there are N times as
many connections whose agent string begins with "Windows" as "Linux"
and "macOS" combined or something.  That would be an offline
analysis, and forcing users to do the stats offline would reduce the
temptation to use it for purposes other than its intended one.

You may find "ImNotTellingYou" and may wonder what OS the user is
really using, but they do not want to tell you, so you honor their
wish.

> I don't see a good solution because
> GIT_USER_AGENT could be used, and the config option to not show the OS
> name could be used too.

That is a good privacy measure.

> Also we don't know what could be in the "version" part. The doc says
> that the agent part is typically of the form "package/version" but
> doesn't require it.

Exactly.  I would think it is a feature, and the way to treat the
string in line with the philosophy behind that feature is to take it
as a single opaque thing.



^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options
  2025-01-31 16:37             ` Junio C Hamano
@ 2025-01-31 19:42               ` Usman Akinyemi
  2025-01-31 20:15                 ` Junio C Hamano
  2025-01-31 19:46               ` Usman Akinyemi
  1 sibling, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-31 19:42 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Christian Couder, git, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, rsbecker, sunshine

On Fri, Jan 31, 2025 at 10:07 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > By the way, if we put the OS information in the "agent" capability,
> > how do we separate it from the existing "package/version" content and
> > make it easy to parse?
>
> Do NOT parse, period.
>
> If three "things" that talk the Git protocol on the other end of the
> connection gives "Linux git/2.48.0", and "macOS libgit2/1.9.0", and
> "Windows git/2.47.1" as their (enhanced) "agent" strings, there is
> no "ah, this one is 1.9.0 which way older than 2.47.1 so it must be
> missing features X and Y" the users of the information are allowed
> to infer.
Hi Junio,

Do you have any concerns about using "git/2.47.1 Windows" instead of
"Windows git/2.47.1"?

Thank you.
>
> Just take it as a single opaque string, and group identical ones.
>
> In the above scenario, we found three different kinds now.  Maybe
> we'll accumulate the counts and notice that there are N times as
> many connections whose agent string begins with "Windows" as "Linux"
> and "macOS" combined or something.  That would be an offline
> analysis, and forcing users to do the stats offline would reduce the
> temptation to use it for purposes other than its intended one.
>
> You may find "ImNotTellingYou" and may wonder what OS the user is
> really using, but they do not want to tell you, so you honor their
> wish.
>
> > I don't see a good solution because
> > GIT_USER_AGENT could be used, and the config option to not show the OS
> > name could be used too.
>
> That is a good privacy measure.
>
> > Also we don't know what could be in the "version" part. The doc says
> > that the agent part is typically of the form "package/version" but
> > doesn't require it.
>
> Exactly.  I would think it is a feature, and the way to treat the
> string in line with the philosophy behind that feature is to take it
> as a single opaque thing.
>
>

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options
  2025-01-31 16:37             ` Junio C Hamano
  2025-01-31 19:42               ` Usman Akinyemi
@ 2025-01-31 19:46               ` Usman Akinyemi
  2025-01-31 20:17                 ` Junio C Hamano
  1 sibling, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-01-31 19:46 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Christian Couder, git, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, rsbecker, sunshine

On Fri, Jan 31, 2025 at 10:07 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > By the way, if we put the OS information in the "agent" capability,
> > how do we separate it from the existing "package/version" content and
> > make it easy to parse?
>
> Do NOT parse, period.
>
> If three "things" that talk the Git protocol on the other end of the
> connection gives "Linux git/2.48.0", and "macOS libgit2/1.9.0", and
> "Windows git/2.47.1" as their (enhanced) "agent" strings, there is
> no "ah, this one is 1.9.0 which way older than 2.47.1 so it must be
> missing features X and Y" the users of the information are allowed
> to infer.
>
> Just take it as a single opaque string, and group identical ones.
>
> In the above scenario, we found three different kinds now.  Maybe
> we'll accumulate the counts and notice that there are N times as
> many connections whose agent string begins with "Windows" as "Linux"
> and "macOS" combined or something.  That would be an offline
> analysis, and forcing users to do the stats offline would reduce the
> temptation to use it for purposes other than its intended one.
>
> You may find "ImNotTellingYou" and may wonder what OS the user is
> really using, but they do not want to tell you, so you honor their
> wish.
While the current implementation allows users to specify a string such as
"ImNotTellingYou" as the agent value, this is not mentioned in the docs.
I will add it in the next iteration.
>
> > I don't see a good solution because
> > GIT_USER_AGENT could be used, and the config option to not show the OS
> > name could be used too.
>
> That is a good privacy measure.
>
> > Also we don't know what could be in the "version" part. The doc says
> > that the agent part is typically of the form "package/version" but
> > doesn't require it.
>
> Exactly.  I would think it is a feature, and the way to treat the
> string in line with the philosophy behind that feature is to take it
> as a single opaque thing.
>
>

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options
  2025-01-31 19:42               ` Usman Akinyemi
@ 2025-01-31 20:15                 ` Junio C Hamano
  0 siblings, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-31 20:15 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: Christian Couder, git, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, rsbecker, sunshine

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> Do you have any concerns "git/2.47.1 Windows" instead of
> "Windows git/2.47.1" ?

Either is fine.  I expect that

 (1) Implementors on _our_ side will do the sensible thing and
     reviewers help them to make sure, where the definition of "the
     sensible thing" will be that whatever order we pick, we
     consistently use that same order.  If "git/2.47.1 Windows" is
     how GfW identifies itself, "git/2.48.1 Linux" or "git/2.49.0
     macOS" would be its contemporary counterparts, and _our_
     binaries would not identify themselves as "Linux git/2.49.0".

 (2) Implementors of third-party reimplementations of Git will just
     mimic what we will do, as long as we tell them our intention
     (i.e. this is a single opaque unparsable string to be collected
     for statistics, nothing more) clearly enough.

 (3) Most users are lazy and/or trusting enough that only a small
     minority of privacy-conscious folks would configure it away,
     making their "IamNotTellingYou" agent string merely an
     insignificant noise in the statistics.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options
  2025-01-31 19:46               ` Usman Akinyemi
@ 2025-01-31 20:17                 ` Junio C Hamano
  0 siblings, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-01-31 20:17 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: Christian Couder, git, ps, johncai86, Johannes.Schindelin, me,
	phillip.wood, rsbecker, sunshine

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

>> You may find "ImNotTellingYou" and may wonder what OS the user is
>> really using, but they do not want to tell you, so you honor their
>> wish.
> While the current implementation allows user to specify this form of string
>  i.e "ImNotTellingYou", for agent value, it is not mentioned in the docs,
> I will add in the next iteration.

OK.  You may want to wait to hear others' opinions, though,
for at least the time it takes for the earth to rotate once.

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v4 0/6][Outreachy] extend agent capability to include OS name
  2025-01-24 12:21     ` [PATCH v3 6/6] connect: advertise OS version Usman Akinyemi
@ 2025-02-05 18:52       ` Usman Akinyemi
  2025-02-05 18:52         ` [PATCH v4 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
                           ` (6 more replies)
  0 siblings, 7 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-05 18:52 UTC (permalink / raw)
  To: git, =christian.couder
  Cc: gitster, Johannes.Schindelin, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, usmanakinyemi202

For debugging, statistical analysis, and security purposes, it can
be valuable for Git servers to know the operating system the clients
are using.

For example:
- A server noticing that a client is using an old Git version with
security issues on one platform, like macOS, could verify if the
user is indeed running macOS before sending a message to upgrade.
- Similarly, a server identifying a client that could benefit from
an upgrade (e.g., for performance reasons) could better customize the
message it sends to nudge the client to upgrade.

Our current agent capability is in the form of "package/version" (e.g.,
"git/1.8.3.1"). Let's extend it to include the operating system name (os),
i.e., in the form "package/version os" (e.g., "git/1.8.3.1 Linux").
The operating system name is retrieved using the 'sysname' field of
the `uname(2)` system call or its equivalent.

Including OS details in the agent capability simplifies implementation,
maintains backward compatibility, avoids introducing a new capability,
encourages adoption across Git-compatible software, and enhances
debugging by providing complete environment information without affecting
functionality.

Due to privacy concerns, let's add the `transfer.advertiseOSInfo` config
option. It defaults to `true` and can be changed to `false`. When `true`,
both the client and server independently append their operating system name
(os) to the `agent` capability value, which then has the form
"package/version os" (e.g., "git/1.8.3.1 Linux"). When `false`, the `agent`
capability has the form "package/version" (e.g., "git/1.8.3.1"). The
server's configuration is independent of the client's.
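
To make the two forms concrete, here is a minimal sketch of how the
agent value could be assembled depending on that boolean. The function
name build_agent() and its parameters are hypothetical and are not the
code in this series; it only illustrates the "package/version os"
versus "package/version" shapes described above.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "git/<version> <os>" into 'out' when
 * 'advertise_os' is non-zero and an OS name is available, otherwise
 * write "git/<version>". Returns 0 on success, -1 if 'out' is too
 * small or formatting failed.
 */
static int build_agent(char *out, size_t outsz, const char *version,
		       const char *os, int advertise_os)
{
	int n;

	if (advertise_os && os && *os)
		n = snprintf(out, outsz, "git/%s %s", version, os);
	else
		n = snprintf(out, outsz, "git/%s", version);

	return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Each side would call such a helper with its own setting, which is why
the server's and the client's agent strings can differ in shape.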

Note that, due to differences between `uname(1)` (command-line
utility) and `uname(2)` (system call) outputs on Windows,
`transfer.advertiseOSInfo` is set to false on Windows during
testing. See the commit message of patch 6/6 for more details.

My mentor, Christian Couder, previously sent a patch series about
this. You can find it here:
https://lore.kernel.org/git/20240619125708.3719150-1-christian.couder@gmail.com/

Changes since v3
================
 - Dropped the last patch, which introduced the `os-version` capability,
   following the discussion on the mailing list about why adding the
   operating system name to the existing agent capability might be better.
   The reasons are stated above; the discussion is here:
   https://public-inbox.org/git/xmqqed0sxdiz.fsf@gitster.g/
 - Extended the agent capability to include the operating system name.

Usman Akinyemi (6):
  version: replace manual ASCII checks with isprint() for clarity
  version: refactor redact_non_printables()
  version: refactor get_uname_info()
  version: extend get_uname_info() to hide system details
  t5701: add setup test to remove side-effect dependency
  agent: advertise OS name via agent capability

 Documentation/config/transfer.txt |  8 ++++
 Documentation/gitprotocol-v2.txt  | 15 ++++--
 builtin/bugreport.c               | 13 +----
 t/t5555-http-smart-common.sh      | 10 +++-
 t/t5701-git-serve.sh              | 19 ++++++--
 t/test-lib-functions.sh           |  8 ++++
 version.c                         | 79 +++++++++++++++++++++++++++++--
 version.h                         | 22 +++++++++
 8 files changed, 149 insertions(+), 25 deletions(-)

Range-diff versus v3:

1:  82b62c5e66 = 1:  82b62c5e66 version: replace manual ASCII checks with isprint() for clarity
2:  0a7d7ce871 = 2:  0a7d7ce871 version: refactor redact_non_printables()
3:  0187db59a4 = 3:  0187db59a4 version: refactor get_uname_info()
4:  d3a3573594 = 4:  d3a3573594 version: extend get_uname_info() to hide system details
5:  d9edd2ffc8 ! 5:  3e0e98f23d t5701: add setup test to remove side-effect dependency
    @@ t/t5701-git-serve.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
      
     -test_expect_success 'test capability advertisement' '
     +test_expect_success 'setup to generate files with expected content' '
    -+	printf "agent=git/%s\n" "$(git version | cut -d" " -f3)" >agent_and_osversion &&
    ++	printf "agent=git/%s\n" "$(git version | cut -d" " -f3)" >agent_capability &&
     +
      	test_oid_cache <<-EOF &&
      	wrong_algo sha1:sha256
    @@ t/t5701-git-serve.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
      	cat >expect.base <<-EOF &&
      	version 2
     -	agent=git/$(git version | cut -d" " -f3)
    -+	$(cat agent_and_osversion)
    ++	$(cat agent_capability)
      	ls-refs=unborn
      	fetch=shallow wait-for-done
      	server-option
6:  351d1eeddb < -:  ---------- connect: advertise OS version
-:  ---------- > 6:  67a2767026 agent: advertise OS name via agent capability

-- 
2.48.1


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v4 1/6] version: replace manual ASCII checks with isprint() for clarity
  2025-02-05 18:52       ` [PATCH v4 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
@ 2025-02-05 18:52         ` Usman Akinyemi
  2025-02-05 18:52         ` [PATCH v4 2/6] version: refactor redact_non_printables() Usman Akinyemi
                           ` (5 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-05 18:52 UTC (permalink / raw)
  To: git, =christian.couder
  Cc: gitster, Johannes.Schindelin, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, usmanakinyemi202, Christian Couder

Since the isprint() function checks for printable characters, let's
replace the existing hardcoded ASCII checks with it. However, since
the original checks also handled spaces, we need to account for spaces
explicitly in the new check.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 version.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
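
For reviewers who want to convince themselves that the two conditions
agree, here is a small standalone sketch (not part of the patch). It
assumes a printable-ASCII test covering 0x20..0x7e, which is how I
understand the isprint() from "sane-ctype.h" to behave; the helper
names are made up for the illustration.

```c
#include <limits.h>

/* Stand-in for the isprint() used in the patch (assumed: 0x20..0x7e). */
static int is_printable_ascii(int c)
{
	return c >= 0x20 && c <= 0x7e;
}

/* The hardcoded check being removed. */
static int old_check(char c)
{
	return c <= 32 || c >= 127;
}

/* The replacement check, with the explicit handling of spaces. */
static int new_check(char c)
{
	return !is_printable_ascii(c) || c == ' ';
}

/* Count the char values for which the two checks disagree. */
static int count_mismatches(void)
{
	int c, mismatches = 0;

	for (c = CHAR_MIN; c <= CHAR_MAX; c++)
		if (!!old_check((char)c) != !!new_check((char)c))
			mismatches++;
	return mismatches;
}
```

The explicit `== ' '` clause is what keeps byte 32 redacted, since a
space is itself a printable character.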

diff --git a/version.c b/version.c
index 4d763ab48d..6cfbb8ca56 100644
--- a/version.c
+++ b/version.c
@@ -2,6 +2,7 @@
 #include "version.h"
 #include "version-def.h"
 #include "strbuf.h"
+#include "sane-ctype.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -29,7 +30,7 @@ const char *git_user_agent_sanitized(void)
 		strbuf_addstr(&buf, git_user_agent());
 		strbuf_trim(&buf);
 		for (size_t i = 0; i < buf.len; i++) {
-			if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
+			if (!isprint(buf.buf[i]) || buf.buf[i] == ' ')
 				buf.buf[i] = '.';
 		}
 		agent = buf.buf;
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v4 2/6] version: refactor redact_non_printables()
  2025-02-05 18:52       ` [PATCH v4 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
  2025-02-05 18:52         ` [PATCH v4 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
@ 2025-02-05 18:52         ` Usman Akinyemi
  2025-02-05 18:52         ` [PATCH v4 3/6] version: refactor get_uname_info() Usman Akinyemi
                           ` (4 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-05 18:52 UTC (permalink / raw)
  To: git, =christian.couder
  Cc: gitster, Johannes.Schindelin, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, usmanakinyemi202, Christian Couder

The git_user_agent_sanitized() function performs some sanitizing to
avoid special characters being sent over the line and possibly messing
up the protocol or the parsing on the other side.

Let's extract this sanitizing into a new redact_non_printables() function,
as we will want to reuse it in a following patch.

For now the new redact_non_printables() function is still static as
it's only needed locally.

While at it, let's use strbuf_detach() to explicitly detach the string
contained by the 'buf' strbuf.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 version.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)
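
As a rough picture of what the extracted helper does, the sketch below
applies the same trim-then-redact steps to a plain NUL-terminated
buffer instead of a strbuf. It is an approximation for illustration:
the function names are hypothetical, and the set of characters trimmed
(space, tab, newline, carriage return) is my assumption about what
strbuf_trim() strips.

```c
#include <stddef.h>
#include <string.h>

/* Assumed whitespace set for trimming; see the note above. */
static int is_trim_space(char c)
{
	return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/*
 * Trim leading and trailing whitespace in place, then replace every
 * remaining byte that is a space or not printable ASCII with '.'.
 */
static void redact_non_printables_cstr(char *s)
{
	size_t len = strlen(s), start = 0, i;

	while (len && is_trim_space(s[len - 1]))
		len--;
	while (start < len && is_trim_space(s[start]))
		start++;
	len -= start;
	memmove(s, s + start, len);
	s[len] = '\0';

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)s[i];

		if (c <= 32 || c >= 127)
			s[i] = '.';
	}
}
```

Note that an interior space is redacted too, so a value containing
"1.8.3.1 Linux" would come out of this helper as "1.8.3.1.Linux".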

diff --git a/version.c b/version.c
index 6cfbb8ca56..60df71fd0e 100644
--- a/version.c
+++ b/version.c
@@ -7,6 +7,19 @@
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
 
+/*
+ * Trim and replace each character that is not printable ASCII, as
+ * well as each space, with a dot '.' character.
+ */
+static void redact_non_printables(struct strbuf *buf)
+{
+	strbuf_trim(buf);
+	for (size_t i = 0; i < buf->len; i++) {
+		if (!isprint(buf->buf[i]) || buf->buf[i] == ' ')
+			buf->buf[i] = '.';
+	}
+}
+
 const char *git_user_agent(void)
 {
 	static const char *agent = NULL;
@@ -28,12 +41,8 @@ const char *git_user_agent_sanitized(void)
 		struct strbuf buf = STRBUF_INIT;
 
 		strbuf_addstr(&buf, git_user_agent());
-		strbuf_trim(&buf);
-		for (size_t i = 0; i < buf.len; i++) {
-			if (!isprint(buf.buf[i]) || buf.buf[i] == ' ')
-				buf.buf[i] = '.';
-		}
-		agent = buf.buf;
+		redact_non_printables(&buf);
+		agent = strbuf_detach(&buf, NULL);
 	}
 
 	return agent;
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v4 3/6] version: refactor get_uname_info()
  2025-02-05 18:52       ` [PATCH v4 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
  2025-02-05 18:52         ` [PATCH v4 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
  2025-02-05 18:52         ` [PATCH v4 2/6] version: refactor redact_non_printables() Usman Akinyemi
@ 2025-02-05 18:52         ` Usman Akinyemi
  2025-02-05 18:52         ` [PATCH v4 4/6] version: extend get_uname_info() to hide system details Usman Akinyemi
                           ` (3 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-05 18:52 UTC (permalink / raw)
  To: git, =christian.couder
  Cc: gitster, Johannes.Schindelin, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, usmanakinyemi202, Christian Couder

Some code from "builtin/bugreport.c" uses uname(2) to get system
information.

Let's refactor this code into a new get_uname_info() function, so
that we can reuse it in a following commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 builtin/bugreport.c | 13 ++-----------
 version.c           | 20 ++++++++++++++++++++
 version.h           |  7 +++++++
 3 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 7c2df035c9..5e13d532a8 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -12,10 +12,10 @@
 #include "diagnose.h"
 #include "object-file.h"
 #include "setup.h"
+#include "version.h"
 
 static void get_system_info(struct strbuf *sys_info)
 {
-	struct utsname uname_info;
 	char *shell = NULL;
 
 	/* get git version from native cmd */
@@ -24,16 +24,7 @@ static void get_system_info(struct strbuf *sys_info)
 
 	/* system call for other version info */
 	strbuf_addstr(sys_info, "uname: ");
-	if (uname(&uname_info))
-		strbuf_addf(sys_info, _("uname() failed with error '%s' (%d)\n"),
-			    strerror(errno),
-			    errno);
-	else
-		strbuf_addf(sys_info, "%s %s %s %s\n",
-			    uname_info.sysname,
-			    uname_info.release,
-			    uname_info.version,
-			    uname_info.machine);
+	get_uname_info(sys_info);
 
 	strbuf_addstr(sys_info, _("compiler info: "));
 	get_compiler_info(sys_info);
diff --git a/version.c b/version.c
index 60df71fd0e..3ec8b8243d 100644
--- a/version.c
+++ b/version.c
@@ -3,6 +3,7 @@
 #include "version-def.h"
 #include "strbuf.h"
 #include "sane-ctype.h"
+#include "gettext.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -47,3 +48,22 @@ const char *git_user_agent_sanitized(void)
 
 	return agent;
 }
+
+int get_uname_info(struct strbuf *buf)
+{
+	struct utsname uname_info;
+
+	if (uname(&uname_info)) {
+		strbuf_addf(buf, _("uname() failed with error '%s' (%d)\n"),
+			    strerror(errno),
+			    errno);
+		return -1;
+	}
+
+	strbuf_addf(buf, "%s %s %s %s\n",
+		    uname_info.sysname,
+		    uname_info.release,
+		    uname_info.version,
+		    uname_info.machine);
+	return 0;
+}
diff --git a/version.h b/version.h
index 7c62e80577..afe3dbbab7 100644
--- a/version.h
+++ b/version.h
@@ -7,4 +7,11 @@ extern const char git_built_from_commit_string[];
 const char *git_user_agent(void);
 const char *git_user_agent_sanitized(void);
 
+/*
+  Try to get information about the system using uname(2).
+  Return -1 and put an error message into 'buf' in case of uname()
+  error. Return 0 and put uname info into 'buf' otherwise.
+*/
+int get_uname_info(struct strbuf *buf);
+
 #endif /* VERSION_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v4 4/6] version: extend get_uname_info() to hide system details
  2025-02-05 18:52       ` [PATCH v4 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
                           ` (2 preceding siblings ...)
  2025-02-05 18:52         ` [PATCH v4 3/6] version: refactor get_uname_info() Usman Akinyemi
@ 2025-02-05 18:52         ` Usman Akinyemi
  2025-02-05 18:52         ` [PATCH v4 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
                           ` (2 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-05 18:52 UTC (permalink / raw)
  To: git, =christian.couder
  Cc: gitster, Johannes.Schindelin, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, usmanakinyemi202, Christian Couder

Currently, the get_uname_info() function provides the full OS information.
In a following commit, we will need it to provide only the OS name.

Let's extend it to accept a "full" flag that makes it switch between
providing full OS information and providing only the OS name.

We may need to refactor this function in the future if an
`osVersion.format` is added.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 builtin/bugreport.c |  2 +-
 version.c           | 16 +++++++++-------
 version.h           |  2 +-
 3 files changed, 11 insertions(+), 9 deletions(-)
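
For readers without the Git tree at hand, the behavior of the "full"
flag can be approximated with plain POSIX uname(2) and a fixed-size
buffer, as in the sketch below. This is not the code in the patch: the
name format_uname() is hypothetical, it writes to a char array rather
than a strbuf, and it omits the trailing newline and the translated
error message.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: write either the full uname information or only
 * the OS name ('sysname') into 'out'. Returns 0 on success, -1 if
 * uname() fails or 'out' is too small.
 */
static int format_uname(char *out, size_t outsz, int full)
{
	struct utsname u;
	int n;

	if (uname(&u))
		return -1;

	if (full)
		n = snprintf(out, outsz, "%s %s %s %s",
			     u.sysname, u.release, u.version, u.machine);
	else
		n = snprintf(out, outsz, "%s", u.sysname);

	return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

With full set to 0 this yields just the OS name, e.g. "Linux" on a
Linux host, which is the value the later patch needs for the agent
string.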

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 5e13d532a8..e3288a86c8 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -24,7 +24,7 @@ static void get_system_info(struct strbuf *sys_info)
 
 	/* system call for other version info */
 	strbuf_addstr(sys_info, "uname: ");
-	get_uname_info(sys_info);
+	get_uname_info(sys_info, 1);
 
 	strbuf_addstr(sys_info, _("compiler info: "));
 	get_compiler_info(sys_info);
diff --git a/version.c b/version.c
index 3ec8b8243d..d95221a72a 100644
--- a/version.c
+++ b/version.c
@@ -49,7 +49,7 @@ const char *git_user_agent_sanitized(void)
 	return agent;
 }
 
-int get_uname_info(struct strbuf *buf)
+int get_uname_info(struct strbuf *buf, unsigned int full)
 {
 	struct utsname uname_info;
 
@@ -59,11 +59,13 @@ int get_uname_info(struct strbuf *buf)
 			    errno);
 		return -1;
 	}
-
-	strbuf_addf(buf, "%s %s %s %s\n",
-		    uname_info.sysname,
-		    uname_info.release,
-		    uname_info.version,
-		    uname_info.machine);
+	if (full)
+		strbuf_addf(buf, "%s %s %s %s\n",
+			    uname_info.sysname,
+			    uname_info.release,
+			    uname_info.version,
+			    uname_info.machine);
+	else
+		strbuf_addf(buf, "%s\n", uname_info.sysname);
 	return 0;
 }
diff --git a/version.h b/version.h
index afe3dbbab7..5eb586c0bd 100644
--- a/version.h
+++ b/version.h
@@ -12,6 +12,6 @@ const char *git_user_agent_sanitized(void);
   Return -1 and put an error message into 'buf' in case of uname()
   error. Return 0 and put uname info into 'buf' otherwise.
 */
-int get_uname_info(struct strbuf *buf);
+int get_uname_info(struct strbuf *buf, unsigned int full);
 
 #endif /* VERSION_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v4 5/6] t5701: add setup test to remove side-effect dependency
  2025-02-05 18:52       ` [PATCH v4 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
                           ` (3 preceding siblings ...)
  2025-02-05 18:52         ` [PATCH v4 4/6] version: extend get_uname_info() to hide system details Usman Akinyemi
@ 2025-02-05 18:52         ` Usman Akinyemi
  2025-02-05 18:52         ` [PATCH v4 6/6] agent: advertise OS name via agent capability Usman Akinyemi
  2025-02-14 12:36         ` [PATCH v5 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-05 18:52 UTC (permalink / raw)
  To: git, =christian.couder
  Cc: gitster, Johannes.Schindelin, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, usmanakinyemi202, Christian Couder

Currently, the "test capability advertisement" test creates some files
with expected content which are used by other tests below it.

To remove that side-effect from this test, let's split up part of
it into a "setup"-type test which creates the files with expected content
which gets reused by multiple tests. This will be useful in a following
commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 t/t5701-git-serve.sh | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index de904c1655..4c24a188b9 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -7,22 +7,28 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
-test_expect_success 'test capability advertisement' '
+test_expect_success 'setup to generate files with expected content' '
+	printf "agent=git/%s\n" "$(git version | cut -d" " -f3)" >agent_capability &&
+
 	test_oid_cache <<-EOF &&
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
+
 	cat >expect.base <<-EOF &&
 	version 2
-	agent=git/$(git version | cut -d" " -f3)
+	$(cat agent_capability)
 	ls-refs=unborn
 	fetch=shallow wait-for-done
 	server-option
 	object-format=$(test_oid algo)
 	EOF
-	cat >expect.trailer <<-EOF &&
+	cat >expect.trailer <<-EOF
 	0000
 	EOF
+'
+
+test_expect_success 'test capability advertisement' '
 	cat expect.base expect.trailer >expect &&
 
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v4 6/6] agent: advertise OS name via agent capability
  2025-02-05 18:52       ` [PATCH v4 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
                           ` (4 preceding siblings ...)
  2025-02-05 18:52         ` [PATCH v4 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
@ 2025-02-05 18:52         ` Usman Akinyemi
  2025-02-05 21:48           ` Junio C Hamano
  2025-02-14 12:36         ` [PATCH v5 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
  6 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-05 18:52 UTC (permalink / raw)
  To: git, =christian.couder
  Cc: gitster, Johannes.Schindelin, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, usmanakinyemi202, Christian Couder

As some issues that can happen with a Git client can be operating system
specific, it can be useful for a server to know which OS a client is
using. In the same way it can be useful for a client to know which OS
a server is using.

Our current agent capability is in the form of "package/version" (e.g.,
"git/1.8.3.1"). Let's extend it to include the operating system name (os),
i.e., in the form "package/version os" (e.g., "git/1.8.3.1 Linux").

Including OS details in the agent capability simplifies implementation,
maintains backward compatibility, avoids introducing a new capability,
encourages adoption across Git-compatible software, and enhances
debugging by providing complete environment information without affecting
functionality.

Add the `transfer.advertiseOSInfo` config option to address privacy
concerns. It defaults to `true` and can be changed to `false`.
When `true`, both the client and server independently append their
operating system name (os) to the `agent` capability value, which
then has the form "package/version os" (e.g., "git/1.8.3.1 Linux").
When `false`, the `agent` capability has the form "package/version"
(e.g., "git/1.8.3.1"). The server's configuration is independent of
the client's. The operating system name is retrieved using the
'sysname' field of the `uname(2)` system call or its equivalent.

However, there are differences between `uname(1)` (command-line utility)
and `uname(2)` (system call) outputs on Windows. These discrepancies
complicate testing on Windows platforms. For example:
  - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\
  .2024-02-14.20:17.UTC.x86_64
  - `uname(2)` output: Windows.10.0.20348

On Windows, uname(2) is not actually system-supplied but is instead
faked up by Git itself. We could have overcome the test issue on
Windows by implementing a new `uname` subcommand in `test-tool`
using uname(2), but that would only test uname(2) against itself
without exercising anything platform specific, so it's simpler to
disable the tests on Windows.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 Documentation/config/transfer.txt |  8 +++++++
 Documentation/gitprotocol-v2.txt  | 15 ++++++++-----
 t/t5555-http-smart-common.sh      | 10 ++++++++-
 t/t5701-git-serve.sh              |  9 +++++++-
 t/test-lib-functions.sh           |  8 +++++++
 version.c                         | 37 +++++++++++++++++++++++++++++++
 version.h                         | 15 +++++++++++++
 7 files changed, 95 insertions(+), 7 deletions(-)

diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
index f1ce50f4a6..1e1dc849ef 100644
--- a/Documentation/config/transfer.txt
+++ b/Documentation/config/transfer.txt
@@ -125,3 +125,11 @@ transfer.bundleURI::
 transfer.advertiseObjectInfo::
 	When `true`, the `object-info` capability is advertised by
 	servers. Defaults to false.
+
+transfer.advertiseOSInfo::
+	When `true`, both the client and server independently append their
+	operating system name (os) to the `agent` capability value. The `agent`
+	capability then has the form "package/version os" (e.g.,
+	"git/1.8.3.1 Linux"). When `false`, the `agent` capability has
+	the form "package/version" (e.g., "git/1.8.3.1"). The server's
+	configuration is independent of the client's. Defaults to `true`.
diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
index 1652fef3ae..8fab7d7d52 100644
--- a/Documentation/gitprotocol-v2.txt
+++ b/Documentation/gitprotocol-v2.txt
@@ -184,11 +184,16 @@ form `agent=X`) to notify the client that the server is running version
 the `agent` capability with a value `Y` (in the form `agent=Y`) in its
 request to the server (but it MUST NOT do so if the server did not
 advertise the agent capability). The `X` and `Y` strings may contain any
-printable ASCII characters except space (i.e., the byte range 32 < x <
-127), and are typically of the form "package/version" (e.g.,
-"git/1.8.3.1"). The agent strings are purely informative for statistics
-and debugging purposes, and MUST NOT be used to programmatically assume
-the presence or absence of particular features.
+printable ASCII characters (i.e., the byte range 32 < x < 127), and are
+typically of the form "package/version os" (e.g., "git/1.8.3.1 Linux")
+where `os` is the operating system name (e.g., "Linux"). `X` and `Y` can
+be configured using the GIT_USER_AGENT environment variable and it takes
+priority. If `transfer.advertiseOSInfo` is `false` on the server, the server
+omits the `os` from X. If it is `false` on the client, the client omits the
+`os` from `Y`. The `os` is retrieved using the 'sysname' field of the `uname(2)`
+system call or its equivalent. The agent strings are purely informative for
+statistics and debugging purposes, and MUST NOT be used to programmatically
+assume the presence or absence of particular features.
 
 ls-refs
 ~~~~~~~
diff --git a/t/t5555-http-smart-common.sh b/t/t5555-http-smart-common.sh
index e47ea1ad10..140a7f0ffb 100755
--- a/t/t5555-http-smart-common.sh
+++ b/t/t5555-http-smart-common.sh
@@ -123,9 +123,17 @@ test_expect_success 'git receive-pack --advertise-refs: v1' '
 '
 
 test_expect_success 'git upload-pack --advertise-refs: v2' '
+	printf "agent=FAKE" >agent_capability &&
+	if test_have_prereq WINDOWS
+	then
+		printf "\n" >>agent_capability &&
+		git config transfer.advertiseOSInfo false
+	else
+		printf " %s\n" $(uname -s | test_redact_non_printables) >>agent_capability
+	fi &&
 	cat >expect <<-EOF &&
 	version 2
-	agent=FAKE
+	$(cat agent_capability)
 	ls-refs=unborn
 	fetch=shallow wait-for-done
 	server-option
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 4c24a188b9..a4c12372f8 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -8,13 +8,20 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 . ./test-lib.sh
 
 test_expect_success 'setup to generate files with expected content' '
-	printf "agent=git/%s\n" "$(git version | cut -d" " -f3)" >agent_capability &&
+	printf "agent=git/%s" "$(git version | cut -d" " -f3)" >agent_capability &&
 
 	test_oid_cache <<-EOF &&
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
 
+	if test_have_prereq WINDOWS
+	then
+		printf "\n" >>agent_capability &&
+		git config transfer.advertiseOSInfo false
+	else
+		printf " %s\n" $(uname -s | test_redact_non_printables) >>agent_capability
+	fi &&
 	cat >expect.base <<-EOF &&
 	version 2
 	$(cat agent_capability)
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78e054ab50..3465904323 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -2007,3 +2007,11 @@ test_trailing_hash () {
 		test-tool hexdump |
 		sed "s/ //g"
 }
+
+# Trim and replace each character with ascii code below 32 or above
+# 127 (included) using a dot '.' character.
+# Octal intervals \001-\040 and \177-\377
+# correspond to decimal intervals 1-32 and 127-255
+test_redact_non_printables () {
+    tr -d "\n\r" | tr "[\001-\040][\177-\377]" "."
+}
diff --git a/version.c b/version.c
index d95221a72a..f0f936a75e 100644
--- a/version.c
+++ b/version.c
@@ -1,9 +1,12 @@
+#define USE_THE_REPOSITORY_VARIABLE
+
 #include "git-compat-util.h"
 #include "version.h"
 #include "version-def.h"
 #include "strbuf.h"
 #include "sane-ctype.h"
 #include "gettext.h"
+#include "config.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -43,6 +46,12 @@ const char *git_user_agent_sanitized(void)
 
 		strbuf_addstr(&buf, git_user_agent());
 		redact_non_printables(&buf);
+		/* Add os name if the transfer.advertiseosinfo config is true */
+		if (advertise_os_info()) {
+			/* Add a space character after the git version string */
+			strbuf_addch(&buf, ' ');
+			strbuf_addstr(&buf, os_info_sanitized());
+		}
 		agent = strbuf_detach(&buf, NULL);
 	}
 
@@ -69,3 +78,31 @@ int get_uname_info(struct strbuf *buf, unsigned int full)
 	     strbuf_addf(buf, "%s\n", uname_info.sysname);
 	return 0;
 }
+
+const char *os_info_sanitized(void)
+{
+	static const char *os = NULL;
+
+	if (!os) {
+		struct strbuf buf = STRBUF_INIT;
+
+		get_uname_info(&buf, 0);
+		/* Sanitize the os information immediately */
+		redact_non_printables(&buf);
+		os = strbuf_detach(&buf, NULL);
+	}
+
+	return os;
+}
+
+int advertise_os_info(void)
+{
+	static int transfer_advertise_os_info= -1;
+
+	if (transfer_advertise_os_info == -1) {
+		repo_config_get_bool(the_repository, "transfer.advertiseosinfo", &transfer_advertise_os_info);
+		/* enabled by default */
+		transfer_advertise_os_info = !!transfer_advertise_os_info;
+	}
+	return transfer_advertise_os_info;
+}
diff --git a/version.h b/version.h
index 5eb586c0bd..b2325865d7 100644
--- a/version.h
+++ b/version.h
@@ -1,6 +1,8 @@
 #ifndef VERSION_H
 #define VERSION_H
 
+struct repository;
+
 extern const char git_version_string[];
 extern const char git_built_from_commit_string[];
 
@@ -14,4 +16,17 @@ const char *git_user_agent_sanitized(void);
 */
 int get_uname_info(struct strbuf *buf, unsigned int full);
 
+/*
+  Retrieve, sanitize and cache operating system info for subsequent
+  calls. Return a pointer to the sanitized operating system info
+  string.
+*/
+const char *os_info_sanitized(void);
+
+/*
+  Retrieve and cache transfer.advertiseosinfo config value. Return 1
+  if true, 0 if false.
+*/
+int advertise_os_info(void);
+
 #endif /* VERSION_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* Re: [PATCH v4 6/6] agent: advertise OS name via agent capability
  2025-02-05 18:52         ` [PATCH v4 6/6] agent: advertise OS name via agent capability Usman Akinyemi
@ 2025-02-05 21:48           ` Junio C Hamano
  2025-02-06  6:37             ` Usman Akinyemi
  2025-02-07 19:25             ` Usman Akinyemi
  0 siblings, 2 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-02-05 21:48 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, Johannes.Schindelin, johncai86, me,
	phillip.wood, ps, rsbecker, sunshine, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> As some issues that can happen with a Git client can be operating system
> specific, it can be useful for a server to know which OS a client is
> using. In the same way it can be useful for a client to know which OS
> a server is using.
>
> Our current agent capability is in the form of "package/version" (e.g.,
> "git/1.8.3.1"). Let's extend it to include the operating system name (os)
> i.e in the form "package/version os" (e.g., "git/1.8.3.1 Linux").
>
> Including OS details in the agent capability simplifies implementation,
> maintains backward compatibility, avoids introducing a new capability,
> encourages adoption across Git-compatible software, and enhances
> debugging by providing complete environment information without affecting
> functionality.

I obviously agree with the benefits enumerated in the above
paragraph.  The simpler, the better.

I however wonder ...

> Add the `transfer.advertiseOSInfo` config option to address privacy
> concerns. It defaults to `true` and can be changed to `false`.

... if this configuration knob is at the right granularity.

For privacy conscious folks, I would imagine that the distinction
between "git/1.8.3.1" vs "git/2.48.1" would be something they do not
want to reveal equally as, if not more than, which Operating System
they are on.  Such a privacy conscious user may already be using
GIT_USER_AGENT environment variable to squelch it already, anyway.

If we were to give them an improvement in the area for privacy
features, I would think it would be to add a configuration variable
to turn the agent off, instead of having to leave GIT_USER_AGENT
environment variable set in the environment of their processes.

On the other hand, for the rest of us who think "git/1.8.3.1 Linux"
is not too much of a secret, we do not need a knob to configure it
between "git/1.8.3.1" and "git/1.8.3.1 Linux".

So, while I view some parts of the series would have been a good
exercise to use various features (like config subsystem) from our
API, I prefer if we kept the end-user interface not overly
customizable (iow, without a config-knob, we do not need to add a
code to inspect the new configuration variable).

After all, GIT_USER_AGENT lets you hide not just the OS part but
any other things from the user-agent string already.

I notice that unlike user_agent() vs user_agent_sanitized(), you
only have a single function for os_info(), which I think is a good
design.  But if we were to go that route, shouldn't we call the
function os_info(), not os_info_sanitized()?  The idea behind a
single function is that you cannot obtain unsanitized version of
os_info() out of the system at all, so what _sanitized() returns
would be what os_info() without _sanitized suffix would return to
the caller anyway.

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v4 6/6] agent: advertise OS name via agent capability
  2025-02-05 21:48           ` Junio C Hamano
@ 2025-02-06  6:37             ` Usman Akinyemi
  2025-02-06 15:13               ` Junio C Hamano
  2025-02-07 19:25             ` Usman Akinyemi
  1 sibling, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-06  6:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, Johannes.Schindelin, johncai86, me,
	phillip.wood, ps, rsbecker, sunshine, Christian Couder

On Thu, Feb 6, 2025 at 3:18 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> > As some issues that can happen with a Git client can be operating system
> > specific, it can be useful for a server to know which OS a client is
> > using. In the same way it can be useful for a client to know which OS
> > a server is using.
> >
> > Our current agent capability is in the form of "package/version" (e.g.,
> > "git/1.8.3.1"). Let's extend it to include the operating system name (os)
> > i.e in the form "package/version os" (e.g., "git/1.8.3.1 Linux").
> >
> > Including OS details in the agent capability simplifies implementation,
> > maintains backward compatibility, avoids introducing a new capability,
> > encourages adoption across Git-compatible software, and enhances
> > debugging by providing complete environment information without affecting
> > functionality.
>
> I obviously agree with the benefits enumerated in the above
> paragraph.  The simpler, the better.
>
> I however wonder ...
>
> > Add the `transfer.advertiseOSInfo` config option to address privacy
> > concerns. It defaults to `true` and can be changed to `false`.
>
> ... if this configuration knob is at the right granularity.
>
> For privacy conscious folks, I would imagine that the distinction
> between "git/1.8.3.1" vs "git/2.48.1" would be something they do not
> want to reveal equally as, if not more than, which Operating System
> they are on.  Such a privacy conscious user may already be using
> GIT_USER_AGENT environment variable to squelch it already, anyway.
>
> If we were to give them an improvement in the area for privacy
> features, I would think it would be to add a configuration variable
> to turn the agent off, instead of having to leave GIT_USER_AGENT
> environment variable set in the environment of their processes.
>
> On the other hand, for the rest of us who think "git/1.8.3.1 Linux"
> is not too much of a secret, we do not need a knob to configure it
> between "git/1.8.3.1" and "git/1.8.3.1 Linux".
>
> So, while I view some parts of the series would have been a good
> exercise to use various features (like config subsystem) from our
> API, I prefer if we kept the end-user interface not overly
> customizable (iow, without a config-knob, we do not need to add a
> code to inspect the new configuration variable).
>
> After all, GIT_USER_AGENT lets you hide not just the OS part but
> any other things from the user-agent string already.
Hi Junio,

So the conclusion now is that we should not add any config option, since
GIT_USER_AGENT already allows the user to hide whatever info they do not
want to share?
>
> I notice that unlike user_agent() vs user_agent_sanitized(), you
> only have a single function for os_info(), which I think is a good
> design.  But if we were to go that route, shouldn't we call the
> function os_info(), not os_info_sanitized()?  The idea behind a
> single function is that you cannot obtain unsanitized version of
> os_info() out of the system at all, so what _sanitized() returns
> would be what os_info() without _sanitized suffix would return to
> the caller anyway.
Yeah, we can change it to os_info(). If someone needs the OS information
in some other way in the future, they could use get_uname_info().

Thanks.
>
> Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v4 6/6] agent: advertise OS name via agent capability
  2025-02-06  6:37             ` Usman Akinyemi
@ 2025-02-06 15:13               ` Junio C Hamano
  2025-02-07 17:27                 ` Usman Akinyemi
  0 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-02-06 15:13 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, Johannes.Schindelin, johncai86, me,
	phillip.wood, ps, rsbecker, sunshine, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

>> I obviously agree with the benefits enumerated in the above
>> paragraph.  The simpler, the better.
>>
>> I however wonder ...
>>
>> > Add the `transfer.advertiseOSInfo` config option to address privacy
>> > concerns. It defaults to `true` and can be changed to `false`.
>>
>> ... if this configuration knob is at the right granularity.
>
> The conclusion now is that we should not add any config option since
> the GIT_USER_AGENT could actually allow the user to hide whatever
> info they do not want to share ?

I wouldn't call that a conclusion (as you and I are the only people
who expressed their opinion on this so far), but that is my take on
it---tweaking only the (os) part in the agent string with a config
smells like the tweakability is at a wrong level.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v4 6/6] agent: advertise OS name via agent capability
  2025-02-06 15:13               ` Junio C Hamano
@ 2025-02-07 17:27                 ` Usman Akinyemi
  2025-02-07 17:57                   ` Junio C Hamano
  0 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-07 17:27 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, Johannes.Schindelin, johncai86, me,
	phillip.wood, ps, rsbecker, sunshine, Christian Couder

On Thu, Feb 6, 2025 at 8:43 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> >> I obviously agree with the benefits enumerated in the above
> >> paragraph.  The simpler, the better.
> >>
> >> I however wonder ...
> >>
> >> > Add the `transfer.advertiseOSInfo` config option to address privacy
> >> > concerns. It defaults to `true` and can be changed to `false`.
> >>
> >> ... if this configuration knob is at the right granularity.
> >
> > The conclusion now is that we should not add any config option since
> > the GIT_USER_AGENT could actually allow the user to hide whatever
> > info they do not want to share ?
>
> I wouldn't call that a conclusion (as you and I are the only people
> who expressed their opinion on this so far), but that is my take on
> it---tweaking only the (os) part in the agent string with a config
> smells like the tweakability is at a wrong level.
>
Hi Junio,

I was actually thinking about this in the bathroom when it occurred to
me that, according to the current implementation, GIT_USER_AGENT does
not allow the user to specify an empty string at all. Either you
specify some value or we decide for you. I think we can add the config
at a level that can disable the agent capability completely instead of
only tweaking the (os) part.

With this, the user can either disable the agent capability completely
or share whatever string they want using GIT_USER_AGENT.

What do you think?

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v4 6/6] agent: advertise OS name via agent capability
  2025-02-07 17:27                 ` Usman Akinyemi
@ 2025-02-07 17:57                   ` Junio C Hamano
  0 siblings, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-02-07 17:57 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: git, christian.couder, Johannes.Schindelin, johncai86, me,
	phillip.wood, ps, rsbecker, sunshine, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> I was actually thinking about this inside the bathroom when it
> occurred to me that,
> according to the current implementation, GIT_USER_AGENT will not allow the user
> to specify an empty string at all. It is either you specify some value
> or we decide for
> you.

Yes.  GIT_USER_AGENT=ImNotTellingYou would work just fine for
privacy conscious folks.
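
As a small illustration of the two ways to set the variable (a sketch
only: `sh -c` is a stand-in for any git command that talks to a remote,
and nothing here inspects what git actually sends on the wire):

```shell
# Per-invocation: only this one child process sees the value.
GIT_USER_AGENT=ImNotTellingYou sh -c 'echo "one-off child sees: $GIT_USER_AGENT"'

# Session-wide: exported once, inherited by every later child process.
GIT_USER_AGENT=ImNotTellingYou
export GIT_USER_AGENT
sh -c 'echo "later child sees: $GIT_USER_AGENT"'
```

The second form is the one that has to stay set in the environment of
the user's processes, which is the inconvenience a configuration
variable would remove.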

> I think we can add the config at a level that can disable the
> agent capability completely
> instead of only tweaking the (os) part.

Yes, go back to a few messages you received from me earlier; it is
already there ;-)

    If we were to give them an improvement in the area for privacy
    features, I would think it would be to add a configuration variable
    to turn the agent off, instead of having to leave GIT_USER_AGENT
    environment variable set in the environment of their processes.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v4 6/6] agent: advertise OS name via agent capability
  2025-02-05 21:48           ` Junio C Hamano
  2025-02-06  6:37             ` Usman Akinyemi
@ 2025-02-07 19:25             ` Usman Akinyemi
  1 sibling, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-07 19:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, Johannes.Schindelin, johncai86, me,
	phillip.wood, ps, rsbecker, sunshine, Christian Couder

On Thu, Feb 6, 2025 at 3:18 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
>
> > As some issues that can happen with a Git client can be operating system
> > specific, it can be useful for a server to know which OS a client is
> > using. In the same way it can be useful for a client to know which OS
> > a server is using.
> >
> > Our current agent capability is in the form of "package/version" (e.g.,
> > "git/1.8.3.1"). Let's extend it to include the operating system name (os)
> > i.e in the form "package/version os" (e.g., "git/1.8.3.1 Linux").
> >
> > Including OS details in the agent capability simplifies implementation,
> > maintains backward compatibility, avoids introducing a new capability,
> > encourages adoption across Git-compatible software, and enhances
> > debugging by providing complete environment information without affecting
> > functionality.
>
> I obviously agree with the benefits enumerated in the above
> paragraph.  The simpler, the better.
>
> I however wonder ...
>
> > Add the `transfer.advertiseOSInfo` config option to address privacy
> > concerns. It defaults to `true` and can be changed to `false`.
>
> ... if this configuration knob is at the right granularity.
>
> For privacy conscious folks, I would imagine that the distinction
> between "git/1.8.3.1" vs "git/2.48.1" would be something they do not
> want to reveal equally as, if not more than, which Operating System
> they are on.  Such a privacy conscious user may already be using
> GIT_USER_AGENT environment variable to squelch it already, anyway.
>
> If we were to give them an improvement in the area for privacy
> features, I would think it would be to add a configuration variable
> to turn the agent off, instead of having to leave GIT_USER_AGENT
> environment variable set in the environment of their processes.
>
> On the other hand, for the rest of us who think "git/1.8.3.1 Linux"
> is not too much of a secret, we do not need a knob to configure it
> between "git/1.8.3.1" and "git/1.8.3.1 Linux".
>
> So, while I view some parts of the series would have been a good
> exercise to use various features (like config subsystem) from our
> API, I prefer if we kept the end-user interface not overly
> customizable (iow, without a config-knob, we do not need to add a
> code to inspect the new configuration variable).
Hi Junio,

Yeah, I believe appending the (os) to the agent might attract the
attention of some privacy conscious users who were not really worried
about the agent when it was just a string like "git/1.8.3.1". While
GIT_USER_AGENT can be used to suppress it, I believe having a dedicated
config option to completely disable the agent is a more user-friendly
and persistent approach than relying solely on environment variables.
Requiring users to manually set GIT_USER_AGENT (since it cannot be
empty) can feel cumbersome, making a config option a cleaner and more
intuitive alternative. Additionally, having a config option would
provide a consistent mechanism in case similar privacy-related features
are introduced in the future.

This is me trying to convince you that having a config option to disable
the agent is more user friendly than having only the environment variable
for users who do not want to share anything at all.

What do you think? Or maybe there is a strong reason for having
GIT_USER_AGENT in the first place and not having a config to disable
the agent capability?

We could also wait for input from other community members.

Thanks,
Usman

>
> After all, GIT_USER_AGENT lets you hide not just the OS part but
> any other things from the user-agent string already.
>
> I notice that unlike user_agent() vs user_agent_sanitized(), you
> only have a single function for os_info(), which I think is a good
> design.  But if we were to go that route, shouldn't we call the
> function os_info(), not os_info_sanitized()?  The idea behind a
> single function is that you cannot obtain unsanitized version of
> os_info() out of the system at all, so what _sanitized() returns
> would be what os_info() without _sanitized suffix would return to
> the caller anyway.
>
> Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v5 0/6][Outreachy] extend agent capability to include OS name
  2025-02-05 18:52       ` [PATCH v4 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
                           ` (5 preceding siblings ...)
  2025-02-05 18:52         ` [PATCH v4 6/6] agent: advertise OS name via agent capability Usman Akinyemi
@ 2025-02-14 12:36         ` Usman Akinyemi
  2025-02-14 12:36           ` [PATCH v5 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
                             ` (6 more replies)
  6 siblings, 7 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-14 12:36 UTC (permalink / raw)
  To: christian.couder, git
  Cc: Johannes.Schindelin, gitster, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine

For debugging, statistical analysis, and security purposes, it can
be valuable for Git servers to know the operating system the clients
are using.

For example:
- A server noticing that a client is using an old Git version with
security issues on one platform, like macOS, could verify if the
user is indeed running macOS before sending a message to upgrade.
- Similarly, a server identifying a client that could benefit from
an upgrade (e.g., for performance reasons) could better customize the
message it sends to nudge the client to upgrade.

Our current agent capability is in the form of "package/version" (e.g.,
"git/1.8.3.1"). Let's extend it to include the operating system name (os)
i.e., in the form "package/version os" (e.g., "git/1.8.3.1 Linux").
The operating system name is retrieved using the 'sysname' field of
the `uname(2)` system call or its equivalent.
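
A rough shell sketch of the resulting value, assuming a POSIX `tr` and
`uname`. The helper names `redact_non_printables` and `agent_value` are
hypothetical, and the redaction mirrors the `test_redact_non_printables`
test helper from this series rather than the C code:

```shell
# Hypothetical helper: drop CR/LF, then replace every byte that is not a
# printable, non-space ASCII character (octal 001-040 and 177-377) with '.'.
redact_non_printables () {
	LC_ALL=C tr -d '\n\r' | LC_ALL=C tr '\001-\040\177-\377' '.'
}

# Hypothetical helper: join a "package/version" string ($1) and an OS
# name ($2) into the "package/version os" form described above.
agent_value () {
	printf '%s %s\n' "$1" "$(printf '%s' "$2" | redact_non_printables)"
}

# uname -s is the command-line counterpart of the sysname field.
agent_value "git/1.8.3.1" "$(uname -s)"
```

Because spaces inside the OS name are redacted too, the only space in
the value is the separator between the version and the OS name.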

Including OS details in the agent capability simplifies implementation,
maintains backward compatibility, avoids introducing a new capability,
encourages adoption across Git-compatible software, and enhances
debugging by providing complete environment information without affecting
functionality.

Note that, due to differences between `uname(1)` (command-line
utility) and `uname(2)` (system call) outputs on Windows, the tests
set GIT_USER_AGENT to a fixed value on Windows instead of comparing
against the OS name. See the message part of patch 6/6 for more details.
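
Going by the range-diff below, the expected `agent` line in t5701 is
built roughly as follows. This is only a sketch: `expected_agent_line`
is a hypothetical name, and its first argument stands in for the real
`test_have_prereq WINDOWS` check:

```shell
# Hypothetical stand-in for the t5701 setup: $1 is "windows" or anything
# else, $2 is the git version, $3 is the already redacted OS name.
expected_agent_line () {
	if test "$1" = windows
	then
		# uname(1) and uname(2) can disagree here, so pin the agent.
		printf 'agent=FAKE\n'
	else
		printf 'agent=git/%s %s\n' "$2" "$3"
	fi
}

expected_agent_line windows 2.48.1 ignored
expected_agent_line other 2.48.1 Linux
```

On Windows the tests then export GIT_USER_AGENT=FAKE before running
`test-tool serve-v2`, so the advertised value matches the pinned line.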

My mentor, Christian Couder, previously sent a patch series about this.
You can find it here:
https://lore.kernel.org/git/20240619125708.3719150-1-christian.couder@gmail.com/

Changes since v4
================
 - Remove the implementation of transfer.advertiseOSInfo config. 
 - Update the documentation.
 - Move the `os_info()` function into "version.c" file.

Usman Akinyemi (6):
  version: replace manual ASCII checks with isprint() for clarity
  version: refactor redact_non_printables()
  version: refactor get_uname_info()
  version: extend get_uname_info() to hide system details
  t5701: add setup test to remove side-effect dependency
  agent: advertise OS name via agent capability

 Documentation/gitprotocol-v2.txt | 13 +++---
 builtin/bugreport.c              | 13 +-----
 t/t5701-git-serve.sh             | 26 ++++++++++--
 t/test-lib-functions.sh          |  8 ++++
 version.c                        | 69 +++++++++++++++++++++++++++++---
 version.h                        | 10 +++++
 6 files changed, 115 insertions(+), 24 deletions(-)

Range-diff versus v4:

1:  82b62c5e66 = 1:  82b62c5e66 version: replace manual ASCII checks with isprint() for clarity
2:  0a7d7ce871 = 2:  0a7d7ce871 version: refactor redact_non_printables()
3:  0187db59a4 = 3:  0187db59a4 version: refactor get_uname_info()
4:  d3a3573594 = 4:  d3a3573594 version: extend get_uname_info() to hide system details
5:  3e0e98f23d = 5:  3e0e98f23d t5701: add setup test to remove side-effect dependency
6:  67a2767026 ! 6:  bcd1130aa1 agent: advertise OS name via agent capability
    @@ Commit message
         maintains backward compatibility, avoids introducing a new capability,
         encourages adoption across Git-compatible software, and enhances
         debugging by providing complete environment information without affecting
    -    functionality.
    -
    -    Add the `transfer.advertiseOSInfo` config option to address privacy
    -    concerns. It defaults to `true` and can be changed to `false`.
    -    When `true`, both the client and server independently append their
    -    operating system name(os) to the `agent` capability value. The `agent`
    -    capability will now be in form of "package/version os" (e.g.,
    -    "git/1.8.3.1 Linux"). When `false`, the `agent` capability will be
    -    in the form of "package/version" e.g "git/1.8.3.1". The server's
    -    configuration is independent of the client's. Defaults to `true`.
    -    The operating system name is retrieved using the 'sysname' field of
    -    the `uname(2)` system call or its equivalent.
    +    functionality. The operating system name is retrieved using the 'sysname'
    +    field of the `uname(2)` system call or its equivalent.
     
         However, there are differences between `uname(1)` (command-line utility)
         and `uname(2)` (system call) outputs on Windows. These discrepancies
    @@ Commit message
         Mentored-by: Christian Couder <chriscool@tuxfamily.org>
         Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
     
    - ## Documentation/config/transfer.txt ##
    -@@ Documentation/config/transfer.txt: transfer.bundleURI::
    - transfer.advertiseObjectInfo::
    - 	When `true`, the `object-info` capability is advertised by
    - 	servers. Defaults to false.
    -+
    -+transfer.advertiseOSInfo::
    -+	When `true`, both the client and server independently append their
    -+	operating system name (os) to the `agent` capability value. The `agent`
    -+	capability will now be in form of "package/version os" (e.g.,
    -+	"git/1.8.3.1 Linux"). When `false`, the `agent` capability will be
    -+	in the form of "package/version" e.g "git/1.8.3.1". The server's
    -+	configuration is independent of the client's. Defaults to `true`.
    -
      ## Documentation/gitprotocol-v2.txt ##
     @@ Documentation/gitprotocol-v2.txt: form `agent=X`) to notify the client that the server is running version
      the `agent` capability with a value `Y` (in the form `agent=Y`) in its
    @@ Documentation/gitprotocol-v2.txt: form `agent=X`) to notify the client that the
     -"git/1.8.3.1"). The agent strings are purely informative for statistics
     -and debugging purposes, and MUST NOT be used to programmatically assume
     -the presence or absence of particular features.
    -+printable ASCII characters (i.e., the byte range 32 < x < 127), and are
    ++printable ASCII characters (i.e., the byte range 31 < x < 127), and are
     +typically of the form "package/version os" (e.g., "git/1.8.3.1 Linux")
     +where `os` is the operating system name (e.g., "Linux"). `X` and `Y` can
     +be configured using the GIT_USER_AGENT environment variable and it takes
    -+priority. If `transfer.advertiseOSInfo` is `false` on the server, the server
    -+omits the `os` from X. If it is `false` on the client, the client omits the
    -+`os` from `Y`. The `os` is retrieved using the 'sysname' field of the `uname(2)`
    ++priority. The `os` is retrieved using the 'sysname' field of the `uname(2)`
     +system call or its equivalent. The agent strings are purely informative for
     +statistics and debugging purposes, and MUST NOT be used to programmatically
     +assume the presence or absence of particular features.
    @@ Documentation/gitprotocol-v2.txt: form `agent=X`) to notify the client that the
      ls-refs
      ~~~~~~~
     
    - ## t/t5555-http-smart-common.sh ##
    -@@ t/t5555-http-smart-common.sh: test_expect_success 'git receive-pack --advertise-refs: v1' '
    - '
    - 
    - test_expect_success 'git upload-pack --advertise-refs: v2' '
    -+	printf "agent=FAKE" >agent_capability &&
    -+	if test_have_prereq WINDOWS
    -+	then
    -+		printf "\n" >>agent_capability &&
    -+		git config transfer.advertiseOSInfo false
    -+	else
    -+		printf " %s\n" $(uname -s | test_redact_non_printables) >>agent_capability
    -+	fi &&
    - 	cat >expect <<-EOF &&
    - 	version 2
    --	agent=FAKE
    -+	$(cat agent_capability)
    - 	ls-refs=unborn
    - 	fetch=shallow wait-for-done
    - 	server-option
    -
      ## t/t5701-git-serve.sh ##
     @@ t/t5701-git-serve.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
      . ./test-lib.sh
    @@ t/t5701-git-serve.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
      
     +	if test_have_prereq WINDOWS
     +	then
    -+		printf "\n" >>agent_capability &&
    -+		git config transfer.advertiseOSInfo false
    ++		printf "agent=FAKE\n" >agent_capability
     +	else
     +		printf " %s\n" $(uname -s | test_redact_non_printables) >>agent_capability
     +	fi &&
      	cat >expect.base <<-EOF &&
      	version 2
      	$(cat agent_capability)
    +@@ t/t5701-git-serve.sh: test_expect_success 'setup to generate files with expected content' '
    + test_expect_success 'test capability advertisement' '
    + 	cat expect.base expect.trailer >expect &&
    + 
    ++	if test_have_prereq WINDOWS
    ++	then
    ++		GIT_USER_AGENT=FAKE && export GIT_USER_AGENT
    ++	fi &&
    + 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
    + 		--advertise-capabilities >out &&
    + 	test-tool pkt-line unpack <out >actual &&
    +@@ t/t5701-git-serve.sh: test_expect_success 'test capability advertisement with uploadpack.advertiseBund
    + 	    expect.extra \
    + 	    expect.trailer >expect &&
    + 
    ++	if test_have_prereq WINDOWS
    ++	then
    ++		GIT_USER_AGENT=FAKE && export GIT_USER_AGENT
    ++	fi &&
    + 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
    + 		--advertise-capabilities >out &&
    + 	test-tool pkt-line unpack <out >actual &&
     
      ## t/test-lib-functions.sh ##
     @@ t/test-lib-functions.sh: test_trailing_hash () {
    @@ version.c
      #include "version.h"
      #include "version-def.h"
      #include "strbuf.h"
    - #include "sane-ctype.h"
    +-#include "sane-ctype.h"
      #include "gettext.h"
    -+#include "config.h"
      
      const char git_version_string[] = GIT_VERSION;
    - const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
    -@@ version.c: const char *git_user_agent_sanitized(void)
    - 
    - 		strbuf_addstr(&buf, git_user_agent());
    - 		redact_non_printables(&buf);
    -+		/* Add os name if the transfer.advertiseosinfo config is true */
    -+		if (advertise_os_info()) {
    -+			/* Add space to space character after git version string */
    -+			strbuf_addch(&buf, ' ');
    -+			strbuf_addstr(&buf, os_info_sanitized());
    -+		}
    - 		agent = strbuf_detach(&buf, NULL);
    - 	}
    - 
    -@@ version.c: int get_uname_info(struct strbuf *buf, unsigned int full)
    - 	     strbuf_addf(buf, "%s\n", uname_info.sysname);
    - 	return 0;
    +@@ version.c: const char *git_user_agent(void)
    + 	return agent;
      }
    -+
    -+const char *os_info_sanitized(void)
    + 
    ++/*
    ++  Retrieve, sanitize and cache operating system info for subsequent
    ++  calls. Return a pointer to the sanitized operating system info
    ++  string.
    ++*/
    ++static const char *os_info(void)
     +{
     +	static const char *os = NULL;
     +
    @@ version.c: int get_uname_info(struct strbuf *buf, unsigned int full)
     +	return os;
     +}
     +
    -+int advertise_os_info(void)
    -+{
    -+	static int transfer_advertise_os_info= -1;
    + const char *git_user_agent_sanitized(void)
    + {
    + 	static const char *agent = NULL;
    +@@ version.c: const char *git_user_agent_sanitized(void)
    + 
    + 		strbuf_addstr(&buf, git_user_agent());
    + 		redact_non_printables(&buf);
     +
    -+	if (transfer_advertise_os_info == -1) {
    -+		repo_config_get_bool(the_repository, "transfer.advertiseosinfo", &transfer_advertise_os_info);
    -+		/* enabled by default */
    -+		transfer_advertise_os_info = !!transfer_advertise_os_info;
    -+	}
    -+	return transfer_advertise_os_info;
    -+}
    ++		if (!getenv("GIT_USER_AGENT")) {
    ++			strbuf_addch(&buf, ' ');
    ++			strbuf_addstr(&buf, os_info());
    ++		}
    + 		agent = strbuf_detach(&buf, NULL);
    + 	}
    + 
     
      ## version.h ##
     @@
    @@ version.h: const char *git_user_agent_sanitized(void);
      */
      int get_uname_info(struct strbuf *buf, unsigned int full);
      
    -+/*
    -+  Retrieve, sanitize and cache operating system info for subsequent
    -+  calls. Return a pointer to the sanitized operating system info
    -+  string.
    -+*/
    -+const char *os_info_sanitized(void);
    -+
    -+/*
    -+  Retrieve and cache transfer.advertiseosinfo config value. Return 1
    -+  if true, 0 if false.
    -+*/
    -+int advertise_os_info(void);
     +
      #endif /* VERSION_H */

-- 
2.48.1


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v5 1/6] version: replace manual ASCII checks with isprint() for clarity
  2025-02-14 12:36         ` [PATCH v5 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
@ 2025-02-14 12:36           ` Usman Akinyemi
  2025-02-14 12:36           ` [PATCH v5 2/6] version: refactor redact_non_printables() Usman Akinyemi
                             ` (5 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-14 12:36 UTC (permalink / raw)
  To: christian.couder, git
  Cc: Johannes.Schindelin, gitster, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, Christian Couder

Since the isprint() function checks for printable characters, let's
replace the existing hardcoded ASCII checks with it. However, since
the original checks also handled spaces, we need to account for spaces
explicitly in the new check.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 version.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
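
A quick way to convince oneself that the new condition redacts exactly
the same bytes as the old one is to compare them over all 256 byte
values. The sketch below is only an illustration: the helper names are
made up, bytes are treated as unsigned, and it uses the standard
isprint() in the default "C" locale rather than Git's own
locale-independent macro from "sane-ctype.h".

```c
#include <ctype.h>

/* Old check: redact bytes <= 32 or >= 127 (viewed as unsigned bytes). */
static int old_needs_redaction(unsigned char c)
{
	return c <= 32 || c >= 127;
}

/* New check: redact anything not printable, plus the space character. */
static int new_needs_redaction(unsigned char c)
{
	return !isprint(c) || c == ' ';
}

/* Return the number of byte values on which the two checks disagree. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (int c = 0; c < 256; c++)
		if (!!old_needs_redaction((unsigned char)c) !=
		    !!new_needs_redaction((unsigned char)c))
			mismatches++;
	return mismatches;
}
```

The explicit "== ' '" test is what keeps the two in agreement, since
isprint() alone treats space (32) as printable.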

diff --git a/version.c b/version.c
index 4d763ab48d..6cfbb8ca56 100644
--- a/version.c
+++ b/version.c
@@ -2,6 +2,7 @@
 #include "version.h"
 #include "version-def.h"
 #include "strbuf.h"
+#include "sane-ctype.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -29,7 +30,7 @@ const char *git_user_agent_sanitized(void)
 		strbuf_addstr(&buf, git_user_agent());
 		strbuf_trim(&buf);
 		for (size_t i = 0; i < buf.len; i++) {
-			if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
+			if (!isprint(buf.buf[i]) || buf.buf[i] == ' ')
 				buf.buf[i] = '.';
 		}
 		agent = buf.buf;
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v5 2/6] version: refactor redact_non_printables()
  2025-02-14 12:36         ` [PATCH v5 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
  2025-02-14 12:36           ` [PATCH v5 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
@ 2025-02-14 12:36           ` Usman Akinyemi
  2025-02-14 12:36           ` [PATCH v5 3/6] version: refactor get_uname_info() Usman Akinyemi
                             ` (4 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-14 12:36 UTC (permalink / raw)
  To: christian.couder, git
  Cc: Johannes.Schindelin, gitster, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, Christian Couder

The git_user_agent_sanitized() function performs some sanitizing to
avoid special characters being sent over the line and possibly messing
up the protocol or the parsing on the other side.

Let's extract this sanitizing into a new redact_non_printables() function,
as we will want to reuse it in a following patch.

For now the new redact_non_printables() function is still static as
it's only needed locally.

While at it, let's use strbuf_detach() to explicitly detach the string
contained by the 'buf' strbuf.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 version.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)
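
To make the trim-then-redact behaviour concrete without pulling in
strbuf, here is a rough standalone equivalent operating on a plain C
string. redact_in_place() is a hypothetical name used only for this
illustration, and it relies on the standard <ctype.h> functions in the
"C" locale instead of Git's sane-ctype macros.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical stand-in for redact_non_printables(): trim leading and
 * trailing whitespace in place, then replace every remaining byte that
 * is not printable ASCII, and every space, with '.'.
 */
static void redact_in_place(char *s)
{
	size_t len = strlen(s);
	size_t start = 0;

	/* Trim trailing, then leading, whitespace. */
	while (len > 0 && isspace((unsigned char)s[len - 1]))
		len--;
	while (start < len && isspace((unsigned char)s[start]))
		start++;
	memmove(s, s + start, len - start);
	len -= start;
	s[len] = '\0';

	/* Redact whatever is left that is not printable, plus spaces. */
	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)s[i];

		if (!isprint(c) || c == ' ')
			s[i] = '.';
	}
}
```

With this, "  git/2.48.1 custom\tbuild\n" becomes
"git/2.48.1.custom.build": outer whitespace is dropped, inner
whitespace is turned into dots.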

diff --git a/version.c b/version.c
index 6cfbb8ca56..60df71fd0e 100644
--- a/version.c
+++ b/version.c
@@ -7,6 +7,19 @@
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
 
+/*
+ * Trim and replace each character that is not printable ASCII, as
+ * well as the space character, with a dot '.' character.
+ */
+static void redact_non_printables(struct strbuf *buf)
+{
+	strbuf_trim(buf);
+	for (size_t i = 0; i < buf->len; i++) {
+		if (!isprint(buf->buf[i]) || buf->buf[i] == ' ')
+			buf->buf[i] = '.';
+	}
+}
+
 const char *git_user_agent(void)
 {
 	static const char *agent = NULL;
@@ -28,12 +41,8 @@ const char *git_user_agent_sanitized(void)
 		struct strbuf buf = STRBUF_INIT;
 
 		strbuf_addstr(&buf, git_user_agent());
-		strbuf_trim(&buf);
-		for (size_t i = 0; i < buf.len; i++) {
-			if (!isprint(buf.buf[i]) || buf.buf[i] == ' ')
-				buf.buf[i] = '.';
-		}
-		agent = buf.buf;
+		redact_non_printables(&buf);
+		agent = strbuf_detach(&buf, NULL);
 	}
 
 	return agent;
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v5 3/6] version: refactor get_uname_info()
  2025-02-14 12:36         ` [PATCH v5 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
  2025-02-14 12:36           ` [PATCH v5 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
  2025-02-14 12:36           ` [PATCH v5 2/6] version: refactor redact_non_printables() Usman Akinyemi
@ 2025-02-14 12:36           ` Usman Akinyemi
  2025-02-14 12:36           ` [PATCH v5 4/6] version: extend get_uname_info() to hide system details Usman Akinyemi
                             ` (3 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-14 12:36 UTC (permalink / raw)
  To: christian.couder, git
  Cc: Johannes.Schindelin, gitster, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, Christian Couder

Some code from "builtin/bugreport.c" uses uname(2) to get system
information.

Let's refactor this code into a new get_uname_info() function, so
that we can reuse it in a following commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 builtin/bugreport.c | 13 ++-----------
 version.c           | 20 ++++++++++++++++++++
 version.h           |  7 +++++++
 3 files changed, 29 insertions(+), 11 deletions(-)
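
For readers who want to try the uname(2) contract outside of Git, the
sketch below mirrors the "return -1 and put an error message into the
buffer, return 0 and put the uname info otherwise" behaviour using a
plain char buffer. describe_system() is a hypothetical name, there is
no gettext translation here, and the sketch tests for a negative
return value because POSIX only promises a non-negative value on
success.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Hypothetical standalone variant: write either the uname fields or
 * an error message into 'out'. Return 0 on success, -1 on failure.
 */
static int describe_system(char *out, size_t outlen)
{
	struct utsname u;

	if (uname(&u) < 0) {
		int err = errno; /* save before any call can clobber it */

		snprintf(out, outlen,
			 "uname() failed with error '%s' (%d)\n",
			 strerror(err), err);
		return -1;
	}

	snprintf(out, outlen, "%s %s %s %s\n",
		 u.sysname, u.release, u.version, u.machine);
	return 0;
}
```

A caller can append the buffer to its report in both cases, and use
the return value only to decide whether the text is real system
information.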

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 7c2df035c9..5e13d532a8 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -12,10 +12,10 @@
 #include "diagnose.h"
 #include "object-file.h"
 #include "setup.h"
+#include "version.h"
 
 static void get_system_info(struct strbuf *sys_info)
 {
-	struct utsname uname_info;
 	char *shell = NULL;
 
 	/* get git version from native cmd */
@@ -24,16 +24,7 @@ static void get_system_info(struct strbuf *sys_info)
 
 	/* system call for other version info */
 	strbuf_addstr(sys_info, "uname: ");
-	if (uname(&uname_info))
-		strbuf_addf(sys_info, _("uname() failed with error '%s' (%d)\n"),
-			    strerror(errno),
-			    errno);
-	else
-		strbuf_addf(sys_info, "%s %s %s %s\n",
-			    uname_info.sysname,
-			    uname_info.release,
-			    uname_info.version,
-			    uname_info.machine);
+	get_uname_info(sys_info);
 
 	strbuf_addstr(sys_info, _("compiler info: "));
 	get_compiler_info(sys_info);
diff --git a/version.c b/version.c
index 60df71fd0e..3ec8b8243d 100644
--- a/version.c
+++ b/version.c
@@ -3,6 +3,7 @@
 #include "version-def.h"
 #include "strbuf.h"
 #include "sane-ctype.h"
+#include "gettext.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -47,3 +48,22 @@ const char *git_user_agent_sanitized(void)
 
 	return agent;
 }
+
+int get_uname_info(struct strbuf *buf)
+{
+	struct utsname uname_info;
+
+	if (uname(&uname_info)) {
+		strbuf_addf(buf, _("uname() failed with error '%s' (%d)\n"),
+			    strerror(errno),
+			    errno);
+		return -1;
+	}
+
+	strbuf_addf(buf, "%s %s %s %s\n",
+		    uname_info.sysname,
+		    uname_info.release,
+		    uname_info.version,
+		    uname_info.machine);
+	return 0;
+}
diff --git a/version.h b/version.h
index 7c62e80577..afe3dbbab7 100644
--- a/version.h
+++ b/version.h
@@ -7,4 +7,11 @@ extern const char git_built_from_commit_string[];
 const char *git_user_agent(void);
 const char *git_user_agent_sanitized(void);
 
+/*
+  Try to get information about the system using uname(2).
+  Return -1 and put an error message into 'buf' in case of uname()
+  error. Return 0 and put uname info into 'buf' otherwise.
+*/
+int get_uname_info(struct strbuf *buf);
+
 #endif /* VERSION_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v5 4/6] version: extend get_uname_info() to hide system details
  2025-02-14 12:36         ` [PATCH v5 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
                             ` (2 preceding siblings ...)
  2025-02-14 12:36           ` [PATCH v5 3/6] version: refactor get_uname_info() Usman Akinyemi
@ 2025-02-14 12:36           ` Usman Akinyemi
  2025-02-14 12:36           ` [PATCH v5 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
                             ` (2 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-14 12:36 UTC (permalink / raw)
  To: christian.couder, git
  Cc: Johannes.Schindelin, gitster, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, Christian Couder

Currently, the get_uname_info() function provides the full OS information.
In a following commit, we will need it to provide only the OS name.

Let's extend it to accept a "full" flag that makes it switch between
providing full OS information and providing only the OS name.

We may need to refactor this function in the future if an
`osVersion.format` is added.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 builtin/bugreport.c |  2 +-
 version.c           | 16 +++++++++-------
 version.h           |  2 +-
 3 files changed, 11 insertions(+), 9 deletions(-)
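
The effect of the "full" flag is easiest to see with fixed input. The
following sketch separates the formatting from the uname(2) call so
that it can be fed a hand-filled struct; format_uname() and its
parameters are assumptions made for this illustration and are not part
of the patch.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical formatter: with 'full' set, emit all four fields as
 * `git bugreport` does; otherwise emit only the OS name (sysname).
 */
static int format_uname(const struct utsname *u, int full,
			char *out, size_t outlen)
{
	if (full)
		return snprintf(out, outlen, "%s %s %s %s\n",
				u->sysname, u->release,
				u->version, u->machine);
	return snprintf(out, outlen, "%s\n", u->sysname);
}
```

Only the short form is meant to leave the machine later in the series,
which is why the release, version and machine fields are dropped
there.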

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 5e13d532a8..e3288a86c8 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -24,7 +24,7 @@ static void get_system_info(struct strbuf *sys_info)
 
 	/* system call for other version info */
 	strbuf_addstr(sys_info, "uname: ");
-	get_uname_info(sys_info);
+	get_uname_info(sys_info, 1);
 
 	strbuf_addstr(sys_info, _("compiler info: "));
 	get_compiler_info(sys_info);
diff --git a/version.c b/version.c
index 3ec8b8243d..d95221a72a 100644
--- a/version.c
+++ b/version.c
@@ -49,7 +49,7 @@ const char *git_user_agent_sanitized(void)
 	return agent;
 }
 
-int get_uname_info(struct strbuf *buf)
+int get_uname_info(struct strbuf *buf, unsigned int full)
 {
 	struct utsname uname_info;
 
@@ -59,11 +59,13 @@ int get_uname_info(struct strbuf *buf)
 			    errno);
 		return -1;
 	}
-
-	strbuf_addf(buf, "%s %s %s %s\n",
-		    uname_info.sysname,
-		    uname_info.release,
-		    uname_info.version,
-		    uname_info.machine);
+	if (full)
+		strbuf_addf(buf, "%s %s %s %s\n",
+			    uname_info.sysname,
+			    uname_info.release,
+			    uname_info.version,
+			    uname_info.machine);
+	else
+		strbuf_addf(buf, "%s\n", uname_info.sysname);
 	return 0;
 }
diff --git a/version.h b/version.h
index afe3dbbab7..5eb586c0bd 100644
--- a/version.h
+++ b/version.h
@@ -12,6 +12,6 @@ const char *git_user_agent_sanitized(void);
   Return -1 and put an error message into 'buf' in case of uname()
   error. Return 0 and put uname info into 'buf' otherwise.
 */
-int get_uname_info(struct strbuf *buf);
+int get_uname_info(struct strbuf *buf, unsigned int full);
 
 #endif /* VERSION_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v5 5/6] t5701: add setup test to remove side-effect dependency
  2025-02-14 12:36         ` [PATCH v5 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
                             ` (3 preceding siblings ...)
  2025-02-14 12:36           ` [PATCH v5 4/6] version: extend get_uname_info() to hide system details Usman Akinyemi
@ 2025-02-14 12:36           ` Usman Akinyemi
  2025-02-14 21:49             ` Junio C Hamano
  2025-02-14 12:36           ` [PATCH v5 6/6] agent: advertise OS name via agent capability Usman Akinyemi
  2025-02-15 15:50           ` [PATCH v6 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
  6 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-14 12:36 UTC (permalink / raw)
  To: christian.couder, git
  Cc: Johannes.Schindelin, gitster, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, Christian Couder

Currently, the "test capability advertisement" test creates some files
with expected content which are used by other tests below it.

To remove that side-effect from this test, let's split up part of
it into a "setup"-type test which creates the files with expected content
which gets reused by multiple tests. This will be useful in a following
commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 t/t5701-git-serve.sh | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index de904c1655..4c24a188b9 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -7,22 +7,28 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
-test_expect_success 'test capability advertisement' '
+test_expect_success 'setup to generate files with expected content' '
+	printf "agent=git/%s\n" "$(git version | cut -d" " -f3)" >agent_capability &&
+
 	test_oid_cache <<-EOF &&
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
+
 	cat >expect.base <<-EOF &&
 	version 2
-	agent=git/$(git version | cut -d" " -f3)
+	$(cat agent_capability)
 	ls-refs=unborn
 	fetch=shallow wait-for-done
 	server-option
 	object-format=$(test_oid algo)
 	EOF
-	cat >expect.trailer <<-EOF &&
+	cat >expect.trailer <<-EOF
 	0000
 	EOF
+'
+
+test_expect_success 'test capability advertisement' '
 	cat expect.base expect.trailer >expect &&
 
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* [PATCH v5 6/6] agent: advertise OS name via agent capability
  2025-02-14 12:36         ` [PATCH v5 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
                             ` (4 preceding siblings ...)
  2025-02-14 12:36           ` [PATCH v5 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
@ 2025-02-14 12:36           ` Usman Akinyemi
  2025-02-14 22:07             ` Junio C Hamano
  2025-02-15 15:50           ` [PATCH v6 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
  6 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-14 12:36 UTC (permalink / raw)
  To: christian.couder, git
  Cc: Johannes.Schindelin, gitster, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, Christian Couder

As some issues that can happen with a Git client can be operating system
specific, it can be useful for a server to know which OS a client is
using. In the same way it can be useful for a client to know which OS
a server is using.

Our current agent capability is in the form of "package/version" (e.g.,
"git/1.8.3.1"). Let's extend it to include the operating system name (os)
i.e., in the form "package/version os" (e.g., "git/1.8.3.1 Linux").

Including OS details in the agent capability simplifies implementation,
maintains backward compatibility, avoids introducing a new capability,
encourages adoption across Git-compatible software, and enhances
debugging by providing complete environment information without affecting
functionality. The operating system name is retrieved using the 'sysname'
field of the `uname(2)` system call or its equivalent.

However, there are differences between `uname(1)` (command-line utility)
and `uname(2)` (system call) outputs on Windows. These discrepancies
complicate testing on Windows platforms. For example:
  - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\
  .2024-02-14.20:17.UTC.x86_64
  - `uname(2)` output: Windows.10.0.20348

On Windows, uname(2) is not actually system-supplied but is instead
emulated by Git itself. We could have overcome the test issue on
Windows by implementing a new `uname` subcommand in `test-tool` using
uname(2), but that would only test uname(2) against itself without
exercising anything platform specific, so it's simpler to disable the
tests on Windows.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 Documentation/gitprotocol-v2.txt | 13 ++++++++-----
 t/t5701-git-serve.sh             | 16 +++++++++++++++-
 t/test-lib-functions.sh          |  8 ++++++++
 version.c                        | 29 ++++++++++++++++++++++++++++-
 version.h                        |  3 +++
 5 files changed, 62 insertions(+), 7 deletions(-)
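
As a compact summary of the rule implemented below, the agent value is
either the GIT_USER_AGENT override taken as-is, or the default
"git/<version>" followed by a space and the OS name. The sketch is a
simplification: compose_agent() and its parameters are hypothetical,
and the inputs are assumed to be sanitized already.

```c
#include <stdio.h>

/*
 * Hypothetical helper: 'override' plays the role of the
 * GIT_USER_AGENT value (NULL when unset); 'version' and 'os' are
 * assumed to be sanitized already.
 */
static void compose_agent(const char *override, const char *version,
			  const char *os, char *out, size_t outlen)
{
	if (override)
		/* An explicit override wins and gets no OS suffix. */
		snprintf(out, outlen, "%s", override);
	else
		snprintf(out, outlen, "git/%s %s", version, os);
}
```

This is also what lets the Windows tests pin the value: once
GIT_USER_AGENT=FAKE is exported, the advertised string is just "FAKE".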

diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
index 1652fef3ae..f4831a8787 100644
--- a/Documentation/gitprotocol-v2.txt
+++ b/Documentation/gitprotocol-v2.txt
@@ -184,11 +184,14 @@ form `agent=X`) to notify the client that the server is running version
 the `agent` capability with a value `Y` (in the form `agent=Y`) in its
 request to the server (but it MUST NOT do so if the server did not
 advertise the agent capability). The `X` and `Y` strings may contain any
-printable ASCII characters except space (i.e., the byte range 32 < x <
-127), and are typically of the form "package/version" (e.g.,
-"git/1.8.3.1"). The agent strings are purely informative for statistics
-and debugging purposes, and MUST NOT be used to programmatically assume
-the presence or absence of particular features.
+printable ASCII characters (i.e., the byte range 31 < x < 127), and are
+typically of the form "package/version os" (e.g., "git/1.8.3.1 Linux")
+where `os` is the operating system name (e.g., "Linux"). `X` and `Y` can
+be configured using the GIT_USER_AGENT environment variable and it takes
+priority. The `os` is retrieved using the 'sysname' field of the `uname(2)`
+system call or its equivalent. The agent strings are purely informative for
+statistics and debugging purposes, and MUST NOT be used to programmatically
+assume the presence or absence of particular features.
 
 ls-refs
 ~~~~~~~
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 4c24a188b9..4f0b053c4a 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -8,13 +8,19 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 . ./test-lib.sh
 
 test_expect_success 'setup to generate files with expected content' '
-	printf "agent=git/%s\n" "$(git version | cut -d" " -f3)" >agent_capability &&
+	printf "agent=git/%s" "$(git version | cut -d" " -f3)" >agent_capability &&
 
 	test_oid_cache <<-EOF &&
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
 
+	if test_have_prereq WINDOWS
+	then
+		printf "agent=FAKE\n" >agent_capability
+	else
+		printf " %s\n" $(uname -s | test_redact_non_printables) >>agent_capability
+	fi &&
 	cat >expect.base <<-EOF &&
 	version 2
 	$(cat agent_capability)
@@ -31,6 +37,10 @@ test_expect_success 'setup to generate files with expected content' '
 test_expect_success 'test capability advertisement' '
 	cat expect.base expect.trailer >expect &&
 
+	if test_have_prereq WINDOWS
+	then
+		GIT_USER_AGENT=FAKE && export GIT_USER_AGENT
+	fi &&
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
 		--advertise-capabilities >out &&
 	test-tool pkt-line unpack <out >actual &&
@@ -361,6 +371,10 @@ test_expect_success 'test capability advertisement with uploadpack.advertiseBund
 	    expect.extra \
 	    expect.trailer >expect &&
 
+	if test_have_prereq WINDOWS
+	then
+		GIT_USER_AGENT=FAKE && export GIT_USER_AGENT
+	fi &&
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
 		--advertise-capabilities >out &&
 	test-tool pkt-line unpack <out >actual &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78e054ab50..3465904323 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -2007,3 +2007,11 @@ test_trailing_hash () {
 		test-tool hexdump |
 		sed "s/ //g"
 }
+
+# Delete newlines and carriage returns, then replace each character
+# with ASCII code 32 or below, or 127 or above, with a dot '.'.
+# Octal intervals \001-\040 and \177-\377
+# correspond to decimal intervals 1-32 and 127-255
+test_redact_non_printables () {
+    tr -d "\n\r" | tr "[\001-\040][\177-\377]" "."
+}
diff --git a/version.c b/version.c
index d95221a72a..027ebc82b4 100644
--- a/version.c
+++ b/version.c
@@ -1,8 +1,9 @@
+#define USE_THE_REPOSITORY_VARIABLE
+
 #include "git-compat-util.h"
 #include "version.h"
 #include "version-def.h"
 #include "strbuf.h"
-#include "sane-ctype.h"
 #include "gettext.h"
 
 const char git_version_string[] = GIT_VERSION;
@@ -34,6 +35,27 @@ const char *git_user_agent(void)
 	return agent;
 }
 
+/*
+  Retrieve, sanitize and cache operating system info for subsequent
+  calls. Return a pointer to the sanitized operating system info
+  string.
+*/
+static const char *os_info(void)
+{
+	static const char *os = NULL;
+
+	if (!os) {
+		struct strbuf buf = STRBUF_INIT;
+
+		get_uname_info(&buf, 0);
+		/* Sanitize the os information immediately */
+		redact_non_printables(&buf);
+		os = strbuf_detach(&buf, NULL);
+	}
+
+	return os;
+}
+
 const char *git_user_agent_sanitized(void)
 {
 	static const char *agent = NULL;
@@ -43,6 +65,11 @@ const char *git_user_agent_sanitized(void)
 
 		strbuf_addstr(&buf, git_user_agent());
 		redact_non_printables(&buf);
+
+		if (!getenv("GIT_USER_AGENT")) {
+			strbuf_addch(&buf, ' ');
+			strbuf_addstr(&buf, os_info());
+		}
 		agent = strbuf_detach(&buf, NULL);
 	}
 
diff --git a/version.h b/version.h
index 5eb586c0bd..bbde6d371a 100644
--- a/version.h
+++ b/version.h
@@ -1,6 +1,8 @@
 #ifndef VERSION_H
 #define VERSION_H
 
+struct repository;
+
 extern const char git_version_string[];
 extern const char git_built_from_commit_string[];
 
@@ -14,4 +16,5 @@ const char *git_user_agent_sanitized(void);
 */
 int get_uname_info(struct strbuf *buf, unsigned int full);
 
+
 #endif /* VERSION_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 108+ messages in thread

* Re: [PATCH v5 5/6] t5701: add setup test to remove side-effect dependency
  2025-02-14 12:36           ` [PATCH v5 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
@ 2025-02-14 21:49             ` Junio C Hamano
  0 siblings, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-02-14 21:49 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: christian.couder, git, Johannes.Schindelin, johncai86, me,
	phillip.wood, ps, rsbecker, sunshine, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> Currently, the "test capability advertisement" test creates some files
> with expected content which are used by other tests below it.
>
> To remove that side-effect from this test, let's split up part of
> it into a "setup"-type test which creates the files with expected content
> which gets reused by multiple tests. This will be useful in a following
> commit.
>
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
> ---
>  t/t5701-git-serve.sh | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)

Up to this step, everything looked very good.


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v5 6/6] agent: advertise OS name via agent capability
  2025-02-14 12:36           ` [PATCH v5 6/6] agent: advertise OS name via agent capability Usman Akinyemi
@ 2025-02-14 22:07             ` Junio C Hamano
  2025-02-15 15:29               ` Usman Akinyemi
  0 siblings, 1 reply; 108+ messages in thread
From: Junio C Hamano @ 2025-02-14 22:07 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: christian.couder, git, Johannes.Schindelin, johncai86, me,
	phillip.wood, ps, rsbecker, sunshine, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> As some issues that can happen with a Git client can be operating system
> specific, it can be useful for a server to know which OS a client is
> using. In the same way it can be useful for a client to know which OS
> a server is using.
>
> Our current agent capability is in the form of "package/version" (e.g.,
> "git/1.8.3.1"). Let's extend it to include the operating system name (os)
> i.e., in the form "package/version os" (e.g., "git/1.8.3.1 Linux").

Shouldn't this be "git/1.8.3.1-Linux" or something to avoid SP?  The
capability list in protocol v1 is on a single line that is whitespace 
separated (cf. connect.c:parse_feature_value()) without any escape
mechanism.
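
To illustrate the concern, here is a deliberately simplified lookup
over a space-separated capability line. It is only a sketch and not
the actual parse_feature_value() code; feature_value() is a made-up
name. The point is that a value ends at the first space, so anything
after an embedded SP is read as a separate capability.

```c
#include <string.h>

/*
 * Simplified illustration, not Git's parse_feature_value(): look up
 * "<name>=" in a space-separated capability line and copy its value,
 * which ends at the next space or at the end of the line.
 * 'outlen' must be at least 1. Return 0 if found, -1 otherwise.
 */
static int feature_value(const char *line, const char *name,
			 char *out, size_t outlen)
{
	size_t namelen = strlen(name);
	const char *p = line;

	while (*p) {
		size_t toklen = strcspn(p, " ");

		if (toklen > namelen && !strncmp(p, name, namelen) &&
		    p[namelen] == '=') {
			size_t vallen = toklen - namelen - 1;

			if (vallen >= outlen)
				vallen = outlen - 1;
			memcpy(out, p + namelen + 1, vallen);
			out[vallen] = '\0';
			return 0;
		}
		/* Skip this token and the spaces that follow it. */
		p += toklen;
		p += strspn(p, " ");
	}
	return -1;
}
```

Given "agent=git/1.8.3.1 Linux ofs-delta", such a parser yields
"git/1.8.3.1" and treats "Linux" as an unknown capability, whereas
"agent=git/1.8.3.1-Linux" survives intact.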

	Side note.  Does it pose a security hole, when we can set
	agent to any value?  I do not think so, as it controls what
	this end sends to the other.  If you are attacker in control
	of your own agent string to be sent to the other end, and
	use a string with a whitespace in it after "agent=" to claim
	that you support a capability you actually don't, that is
	not a new way to attack the other side available to you---you
	can write your own Git client to talk to the other side to
	send such a bogus capability list anyway.

> diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
> index 1652fef3ae..f4831a8787 100644
> --- a/Documentation/gitprotocol-v2.txt
> +++ b/Documentation/gitprotocol-v2.txt
> @@ -184,11 +184,14 @@ form `agent=X`) to notify the client that the server is running version
>  the `agent` capability with a value `Y` (in the form `agent=Y`) in its
>  request to the server (but it MUST NOT do so if the server did not
>  advertise the agent capability). The `X` and `Y` strings may contain any
> -printable ASCII characters except space (i.e., the byte range 32 < x <
> -127), and are typically of the form "package/version" (e.g.,
> -"git/1.8.3.1"). The agent strings are purely informative for statistics
> -and debugging purposes, and MUST NOT be used to programmatically assume
> -the presence or absence of particular features.
> +printable ASCII characters (i.e., the byte range 31 < x < 127), and are

Patches 1 & 2 redacted non-printables and SP separately, because SP
is considered printable.  With this change you are allowing SP to be
passed without getting redacted?  I do not think it is a good idea
(see above).

While I'd prefer to keep the range the same as before, i.e. "any
printable ASCII characters except space", "33 <= x <= 126" may be
more readily recognisable that we are doing something unusual, as
"32 <= x <= 126" is fairly easily recognisable as "ASCII printable".

> +typically of the form "package/version os" (e.g., "git/1.8.3.1 Linux")

So, I'd suggest using something other than " " between "version" and
"os".  Dot (as if the byte there were redacted) or slash or dash or
whatever, anything that is not whitespace.

> +where `os` is the operating system name (e.g., "Linux"). `X` and `Y` can
> +be configured using the GIT_USER_AGENT environment variable and it takes
> +priority. The `os` is retrieved using the 'sysname' field of the `uname(2)`
> +system call or its equivalent. The agent strings are purely informative for
> +statistics and debugging purposes, and MUST NOT be used to programmatically
> +assume the presence or absence of particular features.

Other than these nits, I find the above very well done.

As to the additional implementation of git_user_agent_sanitized(),
except for that same "do we really want SP there?" question, I see
nothing questionable there, either.

Overall very nicely done and presented.

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v5 6/6] agent: advertise OS name via agent capability
  2025-02-14 22:07             ` Junio C Hamano
@ 2025-02-15 15:29               ` Usman Akinyemi
  0 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-15 15:29 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: christian.couder, git, Johannes.Schindelin, johncai86, me,
	phillip.wood, ps, rsbecker, sunshine, Christian Couder

On Sat, Feb 15, 2025 at 3:37 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Usman Akinyemi <usmanakinyemi202@gmail.com> writes:
Hi Junio,
>
> > As some issues that can happen with a Git client can be operating system
> > specific, it can be useful for a server to know which OS a client is
> > using. In the same way it can be useful for a client to know which OS
> > a server is using.
> >
> > Our current agent capability is in the form of "package/version" (e.g.,
> > "git/1.8.3.1"). Let's extend it to include the operating system name (os)
> > i.e., in the form "package/version os" (e.g., "git/1.8.3.1 Linux").
>
> Shouldn't this be "git/1.8.3.1-Linux" or something to avoid SP?  The
> capability list in protocol v1 is on a single line that is whitespace
> separated (cf. connect.c:parse_feature_value()) without any escape
> mechanism.
Yeah, I almost missed this function. Thanks for pointing it out.
>
>         Side note.  Does it pose a security hole, when we can set
>         agent to any value?  I do not think so, as it controls what
>         this end sends to the other.  If you are attacker in control
>         of your own agent string to be sent to the other end, and
>         use a string with a whitespace in it after "agent=" to claim
>         that you support a capability you actually don't, that is
>         not a new way to attack the other side available to you---you
>         can write your own Git client to talk to the other side to
>         send such a bogus capability list anyway.
Thanks for this explanation.
>
> > diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
> > index 1652fef3ae..f4831a8787 100644
> > --- a/Documentation/gitprotocol-v2.txt
> > +++ b/Documentation/gitprotocol-v2.txt
> > @@ -184,11 +184,14 @@ form `agent=X`) to notify the client that the server is running version
> >  the `agent` capability with a value `Y` (in the form `agent=Y`) in its
> >  request to the server (but it MUST NOT do so if the server did not
> >  advertise the agent capability). The `X` and `Y` strings may contain any
> > -printable ASCII characters except space (i.e., the byte range 32 < x <
> > -127), and are typically of the form "package/version" (e.g.,
> > -"git/1.8.3.1"). The agent strings are purely informative for statistics
> > -and debugging purposes, and MUST NOT be used to programmatically assume
> > -the presence or absence of particular features.
> > +printable ASCII characters (i.e., the byte range 31 < x < 127), and are
>
> Patches 1 & 2 redacted non-printables and SP separately, because SP
> is considered printable.  With this change you are allowing SP to be
> passed without getting redacted?  I do not think it is a good idea
> (see above).
>
> While I'd prefer to keep the range the same as before, i.e. "any
> printable ASCII characters except space", "33 <= x <= 126" may be
> more readily recognisable that we are doing something unusual, as
> "32 <= x <= 126" is fairly easily recognisable as "ASCII printable".
>
> > +typically of the form "package/version os" (e.g., "git/1.8.3.1 Linux")
>
> So, I'd suggest using something other than " " between "version" and
> "os".  Dot (as if the byte there were redacted) or slash or dash or
> whatever, anything that is not whitespace.
Yeah, Noted. Thanks.
>
> > +where `os` is the operating system name (e.g., "Linux"). `X` and `Y` can
> > +be configured using the GIT_USER_AGENT environment variable and it takes
> > +priority. The `os` is retrieved using the 'sysname' field of the `uname(2)`
> > +system call or its equivalent. The agent strings are purely informative for
> > +statistics and debugging purposes, and MUST NOT be used to programmatically
> > +assume the presence or absence of particular features.
>
> Other than these nits, I find the above very well done.
>
> As to the additional implementation of git_user_agent_sanitized(),
> except for that same "do we really want SP there?" question, I see
> nothing questionable there, either.
>
> Overall very nicely done and presented.
Thank you.
>
> Thanks.
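
The whitespace concern discussed above can be illustrated with a small
sketch. It is not Git's parse_feature_value(); feature_value() is a
hypothetical helper that only assumes what the review states: the
capability list is a single space-separated line with no escape
mechanism, so a value ends at the next space.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not Git's parse_feature_value(): look up
 * "name=value" in a space-separated capability list. On success,
 * return a pointer to the value and store its length in *len.
 * The value ends at the next space or at the end of the list.
 */
static const char *feature_value(const char *list, const char *name,
				 size_t *len)
{
	size_t name_len = strlen(name);
	const char *p = list;

	assert(list && name && len);
	while (*p) {
		size_t tok_len = strcspn(p, " ");

		if (tok_len > name_len && !strncmp(p, name, name_len) &&
		    p[name_len] == '=') {
			*len = tok_len - name_len - 1;
			return p + name_len + 1;
		}
		p += tok_len;		/* skip this token */
		p += strspn(p, " ");	/* skip the separator(s) */
	}
	return NULL;
}
```

With "agent=git/1.8.3.1 Linux" in the list, such a parser would see the
value "git/1.8.3.1" and treat "Linux" as a separate capability token,
whereas "agent=git/1.8.3.1-Linux" stays a single value.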


* [PATCH v6 0/6][Outreachy] extend agent capability to include OS name
  2025-02-14 12:36         ` [PATCH v5 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
                             ` (5 preceding siblings ...)
  2025-02-14 12:36           ` [PATCH v5 6/6] agent: advertise OS name via agent capability Usman Akinyemi
@ 2025-02-15 15:50           ` Usman Akinyemi
  2025-02-15 15:50             ` [PATCH v6 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
                               ` (6 more replies)
  6 siblings, 7 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-15 15:50 UTC (permalink / raw)
  To: christian.couder, gitster
  Cc: Johannes.Schindelin, git, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine

For debugging, statistical analysis, and security purposes, it can
be valuable for Git servers to know the operating system the clients
are using.

For example:
- A server noticing that a client is using an old Git version with
security issues on one platform, like macOS, could verify if the
user is indeed running macOS before sending a message to upgrade.
- Similarly, a server identifying a client that could benefit from
an upgrade (e.g., for performance reasons) could better customize the
message it sends to nudge the client to upgrade.

Our current agent capability is in the form of "package/version" (e.g.,
"git/1.8.3.1"). Let's extend it to include the operating system name (os),
i.e., in the form "package/version-os" (e.g., "git/1.8.3.1-Linux").
The operating system name is retrieved using the 'sysname' field of
the `uname(2)` system call or its equivalent.

Including OS details in the agent capability simplifies implementation,
maintains backward compatibility, avoids introducing a new capability,
encourages adoption across Git-compatible software, and enhances
debugging by providing complete environment information without affecting
functionality.

Note that, due to differences between `uname(1)` (command-line
utility) and `uname(2)` (system call) outputs on Windows,
`transfer.advertiseOSVersion` is set to false on Windows during
testing. See the message part of patch 5/6 for more details.

My mentor, Christian Couder, sent a previous patch series about this
before. You can find it here
https://lore.kernel.org/git/20240619125708.3719150-1-christian.couder@gmail.com/

Changes since v5
================
 - Used "-" instead of " " for separating "version" and "os" in the agent string.

Usman Akinyemi (6):
  version: replace manual ASCII checks with isprint() for clarity
  version: refactor redact_non_printables()
  version: refactor get_uname_info()
  version: extend get_uname_info() to hide system details
  t5701: add setup test to remove side-effect dependency
  agent: advertise OS name via agent capability

 Documentation/gitprotocol-v2.txt | 13 +++---
 builtin/bugreport.c              | 13 +-----
 connect.c                        |  2 +-
 t/t5701-git-serve.sh             | 26 ++++++++++--
 t/test-lib-functions.sh          |  8 ++++
 version.c                        | 69 +++++++++++++++++++++++++++++---
 version.h                        | 10 +++++
 7 files changed, 116 insertions(+), 25 deletions(-)

Range-diff versus v5:

1:  82b62c5e66 = 1:  82b62c5e66 version: replace manual ASCII checks with isprint() for clarity
2:  0a7d7ce871 = 2:  0a7d7ce871 version: refactor redact_non_printables()
3:  0187db59a4 = 3:  0187db59a4 version: refactor get_uname_info()
4:  d3a3573594 = 4:  d3a3573594 version: extend get_uname_info() to hide system details
5:  3e0e98f23d = 5:  3e0e98f23d t5701: add setup test to remove side-effect dependency
6:  8878e9c9ab ! 6:  48cf946f61 agent: advertise OS name via agent capability
    @@ Commit message
     
         Our current agent capability is in the form of "package/version" (e.g.,
         "git/1.8.3.1"). Let's extend it to include the operating system name (os)
    -    i.e in the form "package/version os" (e.g., "git/1.8.3.1 Linux").
    +    i.e in the form "package/version-os" (e.g., "git/1.8.3.1-Linux").
     
         Including OS details in the agent capability simplifies implementation,
         maintains backward compatibility, avoids introducing a new capability,
    @@ Documentation/gitprotocol-v2.txt: form `agent=X`) to notify the client that the
     -"git/1.8.3.1"). The agent strings are purely informative for statistics
     -and debugging purposes, and MUST NOT be used to programmatically assume
     -the presence or absence of particular features.
    -+printable ASCII characters (i.e., the byte range 31 < x < 127), and are
    -+typically of the form "package/version os" (e.g., "git/1.8.3.1 Linux")
    ++printable ASCII characters (i.e., the byte range 33 <= x <= 126), and are
    ++typically of the form "package/version-os" (e.g., "git/1.8.3.1-Linux")
     +where `os` is the operating system name (e.g., "Linux"). `X` and `Y` can
     +be configured using the GIT_USER_AGENT environment variable and it takes
     +priority. The `os` is retrieved using the 'sysname' field of the `uname(2)`
    @@ Documentation/gitprotocol-v2.txt: form `agent=X`) to notify the client that the
      ls-refs
      ~~~~~~~
     
    + ## connect.c ##
    +@@ connect.c: const char *parse_feature_value(const char *feature_list, const char *feature, s
    + 					*offset = found + len - orig_start;
    + 				return value;
    + 			}
    +-			/* feature with a value (e.g., "agent=git/1.2.3") */
    ++			/* feature with a value (e.g., "agent=git/1.2.3-Linux") */
    + 			else if (*value == '=') {
    + 				size_t end;
    + 
    +
      ## t/t5701-git-serve.sh ##
     @@ t/t5701-git-serve.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
      . ./test-lib.sh
    @@ t/t5701-git-serve.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
     +	then
     +		printf "agent=FAKE\n" >agent_capability
     +	else
    -+		printf " %s\n" $(uname -s | test_redact_non_printables) >>agent_capability
    ++		printf -- "-%s\n" $(uname -s | test_redact_non_printables) >>agent_capability
     +	fi &&
      	cat >expect.base <<-EOF &&
      	version 2
    @@ version.c
      #include "gettext.h"
      
      const char git_version_string[] = GIT_VERSION;
    -@@ version.c: const char *git_user_agent_sanitized(void)
    - 
    - 		strbuf_addstr(&buf, git_user_agent());
    - 		redact_non_printables(&buf);
    -+
    -+		if (!getenv("GIT_USER_AGENT")) {
    -+			strbuf_addch(&buf, ' ');
    -+			strbuf_addstr(&buf, os_info());
    -+		}
    - 		agent = strbuf_detach(&buf, NULL);
    - 	}
    - 
    -@@ version.c: int get_uname_info(struct strbuf *buf, unsigned int full)
    - 	     strbuf_addf(buf, "%s\n", uname_info.sysname);
    - 	return 0;
    +@@ version.c: const char *git_user_agent(void)
    + 	return agent;
      }
    -+
    -+const char *os_info(void)
    + 
    ++/*
    ++  Retrieve, sanitize and cache operating system info for subsequent
    ++  calls. Return a pointer to the sanitized operating system info
    ++  string.
    ++*/
    ++static const char *os_info(void)
     +{
     +	static const char *os = NULL;
     +
    @@ version.c: int get_uname_info(struct strbuf *buf, unsigned int full)
     +
     +	return os;
     +}
    ++
    + const char *git_user_agent_sanitized(void)
    + {
    + 	static const char *agent = NULL;
    +@@ version.c: const char *git_user_agent_sanitized(void)
    + 		struct strbuf buf = STRBUF_INIT;
    + 
    + 		strbuf_addstr(&buf, git_user_agent());
    ++
    ++		if (!getenv("GIT_USER_AGENT")) {
    ++			strbuf_addch(&buf, '-');
    ++			strbuf_addstr(&buf, os_info());
    ++		}
    + 		redact_non_printables(&buf);
    + 		agent = strbuf_detach(&buf, NULL);
    + 	}
     
      ## version.h ##
     @@
    @@ version.h: const char *git_user_agent_sanitized(void);
      */
      int get_uname_info(struct strbuf *buf, unsigned int full);
      
    -+/*
    -+  Retrieve, sanitize and cache operating system info for subsequent
    -+  calls. Return a pointer to the sanitized operating system info
    -+  string.
    -+*/
    -+const char *os_info(void);
     +
      #endif /* VERSION_H */

-- 
2.48.1



* [PATCH v6 1/6] version: replace manual ASCII checks with isprint() for clarity
  2025-02-15 15:50           ` [PATCH v6 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
@ 2025-02-15 15:50             ` Usman Akinyemi
  2025-02-15 15:50             ` [PATCH v6 2/6] version: refactor redact_non_printables() Usman Akinyemi
                               ` (5 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-15 15:50 UTC (permalink / raw)
  To: christian.couder, gitster
  Cc: Johannes.Schindelin, git, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, Christian Couder

Since the isprint() function checks for printable characters, let's
replace the existing hardcoded ASCII checks with it. However, since
the original checks also handled spaces, we need to account for spaces
explicitly in the new check.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 version.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/version.c b/version.c
index 4d763ab48d..6cfbb8ca56 100644
--- a/version.c
+++ b/version.c
@@ -2,6 +2,7 @@
 #include "version.h"
 #include "version-def.h"
 #include "strbuf.h"
+#include "sane-ctype.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -29,7 +30,7 @@ const char *git_user_agent_sanitized(void)
 		strbuf_addstr(&buf, git_user_agent());
 		strbuf_trim(&buf);
 		for (size_t i = 0; i < buf.len; i++) {
-			if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
+			if (!isprint(buf.buf[i]) || buf.buf[i] == ' ')
 				buf.buf[i] = '.';
 		}
 		agent = buf.buf;
-- 
2.48.1
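
The equivalence claimed in the commit message above can be checked with
a short sketch. It assumes the standard <ctype.h> isprint() in the
default "C" locale instead of Git's own sane-ctype.h, and the function
names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>

/* Old-style check: redact every byte outside the range 33..126. */
static int redact_by_range(unsigned char c)
{
	return c <= 32 || c >= 127;
}

/* New-style check: redact non-printable bytes and the space character. */
static int redact_by_isprint(unsigned char c)
{
	return !isprint(c) || c == ' ';
}

/* Abort if the two predicates disagree on any byte value. */
static void check_predicates_agree(void)
{
	for (int c = 0; c < 256; c++)
		assert(redact_by_range((unsigned char)c) ==
		       redact_by_isprint((unsigned char)c));
}
```

The explicit `c == ' '` test is what keeps the two forms equivalent,
since isprint() alone considers the space character printable.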



* [PATCH v6 2/6] version: refactor redact_non_printables()
  2025-02-15 15:50           ` [PATCH v6 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
  2025-02-15 15:50             ` [PATCH v6 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
@ 2025-02-15 15:50             ` Usman Akinyemi
  2025-02-15 15:50             ` [PATCH v6 3/6] version: refactor get_uname_info() Usman Akinyemi
                               ` (4 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-15 15:50 UTC (permalink / raw)
  To: christian.couder, gitster
  Cc: Johannes.Schindelin, git, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, Christian Couder

The git_user_agent_sanitized() function performs some sanitizing to
avoid special characters being sent over the line and possibly messing
up the protocol or the parsing on the other side.

Let's extract this sanitizing into a new redact_non_printables() function,
as we will want to reuse it in a following patch.

For now the new redact_non_printables() function is still static as
it's only needed locally.

While at it, let's use strbuf_detach() to explicitly detach the string
contained by the 'buf' strbuf.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 version.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/version.c b/version.c
index 6cfbb8ca56..60df71fd0e 100644
--- a/version.c
+++ b/version.c
@@ -7,6 +7,19 @@
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
 
+/*
+ * Trim and replace each character with ascii code below 32 or above
+ * 127 (included) using a dot '.' character.
+ */
+static void redact_non_printables(struct strbuf *buf)
+{
+	strbuf_trim(buf);
+	for (size_t i = 0; i < buf->len; i++) {
+		if (!isprint(buf->buf[i]) || buf->buf[i] == ' ')
+			buf->buf[i] = '.';
+	}
+}
+
 const char *git_user_agent(void)
 {
 	static const char *agent = NULL;
@@ -28,12 +41,8 @@ const char *git_user_agent_sanitized(void)
 		struct strbuf buf = STRBUF_INIT;
 
 		strbuf_addstr(&buf, git_user_agent());
-		strbuf_trim(&buf);
-		for (size_t i = 0; i < buf.len; i++) {
-			if (!isprint(buf.buf[i]) || buf.buf[i] == ' ')
-				buf.buf[i] = '.';
-		}
-		agent = buf.buf;
+		redact_non_printables(&buf);
+		agent = strbuf_detach(&buf, NULL);
 	}
 
 	return agent;
-- 
2.48.1
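
For readers without Git's strbuf API at hand, the sanitizing described
above can be sketched on a plain NUL-terminated string. This is only an
approximation: redact_in_place() is a hypothetical name, and the
standard <ctype.h> functions in the "C" locale stand in for Git's
sane-ctype.h.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical stand-in for redact_non_printables(): trim whitespace
 * at both ends, then replace spaces and non-printable bytes with '.'.
 */
static void redact_in_place(char *s)
{
	size_t len, start = 0;

	assert(s);
	len = strlen(s);

	/* Trim trailing, then leading, whitespace. */
	while (len > 0 && isspace((unsigned char)s[len - 1]))
		len--;
	while (start < len && isspace((unsigned char)s[start]))
		start++;
	memmove(s, s + start, len - start);
	len -= start;
	s[len] = '\0';

	/* Redact whatever is left that is not in the range 33..126. */
	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)s[i];

		if (!isprint(c) || c == ' ')
			s[i] = '.';
	}
}
```

Trimming first means that leading and trailing whitespace is dropped
instead of being turned into dots; only interior bytes are redacted.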



* [PATCH v6 3/6] version: refactor get_uname_info()
  2025-02-15 15:50           ` [PATCH v6 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
  2025-02-15 15:50             ` [PATCH v6 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
  2025-02-15 15:50             ` [PATCH v6 2/6] version: refactor redact_non_printables() Usman Akinyemi
@ 2025-02-15 15:50             ` Usman Akinyemi
  2025-02-15 15:50             ` [PATCH v6 4/6] version: extend get_uname_info() to hide system details Usman Akinyemi
                               ` (3 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-15 15:50 UTC (permalink / raw)
  To: christian.couder, gitster
  Cc: Johannes.Schindelin, git, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, Christian Couder

Some code from "builtin/bugreport.c" uses uname(2) to get system
information.

Let's refactor this code into a new get_uname_info() function, so
that we can reuse it in a following commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 builtin/bugreport.c | 13 ++-----------
 version.c           | 20 ++++++++++++++++++++
 version.h           |  7 +++++++
 3 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 7c2df035c9..5e13d532a8 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -12,10 +12,10 @@
 #include "diagnose.h"
 #include "object-file.h"
 #include "setup.h"
+#include "version.h"
 
 static void get_system_info(struct strbuf *sys_info)
 {
-	struct utsname uname_info;
 	char *shell = NULL;
 
 	/* get git version from native cmd */
@@ -24,16 +24,7 @@ static void get_system_info(struct strbuf *sys_info)
 
 	/* system call for other version info */
 	strbuf_addstr(sys_info, "uname: ");
-	if (uname(&uname_info))
-		strbuf_addf(sys_info, _("uname() failed with error '%s' (%d)\n"),
-			    strerror(errno),
-			    errno);
-	else
-		strbuf_addf(sys_info, "%s %s %s %s\n",
-			    uname_info.sysname,
-			    uname_info.release,
-			    uname_info.version,
-			    uname_info.machine);
+	get_uname_info(sys_info);
 
 	strbuf_addstr(sys_info, _("compiler info: "));
 	get_compiler_info(sys_info);
diff --git a/version.c b/version.c
index 60df71fd0e..3ec8b8243d 100644
--- a/version.c
+++ b/version.c
@@ -3,6 +3,7 @@
 #include "version-def.h"
 #include "strbuf.h"
 #include "sane-ctype.h"
+#include "gettext.h"
 
 const char git_version_string[] = GIT_VERSION;
 const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
@@ -47,3 +48,22 @@ const char *git_user_agent_sanitized(void)
 
 	return agent;
 }
+
+int get_uname_info(struct strbuf *buf)
+{
+	struct utsname uname_info;
+
+	if (uname(&uname_info)) {
+		strbuf_addf(buf, _("uname() failed with error '%s' (%d)\n"),
+			    strerror(errno),
+			    errno);
+		return -1;
+	}
+
+	strbuf_addf(buf, "%s %s %s %s\n",
+		    uname_info.sysname,
+		    uname_info.release,
+		    uname_info.version,
+		    uname_info.machine);
+	return 0;
+}
diff --git a/version.h b/version.h
index 7c62e80577..afe3dbbab7 100644
--- a/version.h
+++ b/version.h
@@ -7,4 +7,11 @@ extern const char git_built_from_commit_string[];
 const char *git_user_agent(void);
 const char *git_user_agent_sanitized(void);
 
+/*
+  Try to get information about the system using uname(2).
+  Return -1 and put an error message into 'buf' in case of uname()
+  error. Return 0 and put uname info into 'buf' otherwise.
+*/
+int get_uname_info(struct strbuf *buf);
+
 #endif /* VERSION_H */
-- 
2.48.1



* [PATCH v6 4/6] version: extend get_uname_info() to hide system details
  2025-02-15 15:50           ` [PATCH v6 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
                               ` (2 preceding siblings ...)
  2025-02-15 15:50             ` [PATCH v6 3/6] version: refactor get_uname_info() Usman Akinyemi
@ 2025-02-15 15:50             ` Usman Akinyemi
  2025-02-15 15:50             ` [PATCH v6 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
                               ` (2 subsequent siblings)
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-15 15:50 UTC (permalink / raw)
  To: christian.couder, gitster
  Cc: Johannes.Schindelin, git, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, Christian Couder

Currently, the get_uname_info() function provides the full OS information.
In a following commit, we will need it to provide only the OS name.

Let's extend it to accept a "full" flag that makes it switch between
providing full OS information and providing only the OS name.

We may need to refactor this function in the future if an
`osVersion.format` is added.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 builtin/bugreport.c |  2 +-
 version.c           | 16 +++++++++-------
 version.h           |  2 +-
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 5e13d532a8..e3288a86c8 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -24,7 +24,7 @@ static void get_system_info(struct strbuf *sys_info)
 
 	/* system call for other version info */
 	strbuf_addstr(sys_info, "uname: ");
-	get_uname_info(sys_info);
+	get_uname_info(sys_info, 1);
 
 	strbuf_addstr(sys_info, _("compiler info: "));
 	get_compiler_info(sys_info);
diff --git a/version.c b/version.c
index 3ec8b8243d..d95221a72a 100644
--- a/version.c
+++ b/version.c
@@ -49,7 +49,7 @@ const char *git_user_agent_sanitized(void)
 	return agent;
 }
 
-int get_uname_info(struct strbuf *buf)
+int get_uname_info(struct strbuf *buf, unsigned int full)
 {
 	struct utsname uname_info;
 
@@ -59,11 +59,13 @@ int get_uname_info(struct strbuf *buf)
 			    errno);
 		return -1;
 	}
-
-	strbuf_addf(buf, "%s %s %s %s\n",
-		    uname_info.sysname,
-		    uname_info.release,
-		    uname_info.version,
-		    uname_info.machine);
+	if (full)
+		strbuf_addf(buf, "%s %s %s %s\n",
+			    uname_info.sysname,
+			    uname_info.release,
+			    uname_info.version,
+			    uname_info.machine);
+	else
+	     strbuf_addf(buf, "%s\n", uname_info.sysname);
 	return 0;
 }
diff --git a/version.h b/version.h
index afe3dbbab7..5eb586c0bd 100644
--- a/version.h
+++ b/version.h
@@ -12,6 +12,6 @@ const char *git_user_agent_sanitized(void);
   Return -1 and put an error message into 'buf' in case of uname()
   error. Return 0 and put uname info into 'buf' otherwise.
 */
-int get_uname_info(struct strbuf *buf);
+int get_uname_info(struct strbuf *buf, unsigned int full);
 
 #endif /* VERSION_H */
-- 
2.48.1
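
The behaviour of the "full" flag introduced above can be sketched
outside of Git as follows. This assumes a POSIX system providing
uname(2); describe_os() is a hypothetical name, and it writes into a
caller-supplied buffer instead of a strbuf and omits the trailing
newline.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Hypothetical analogue of get_uname_info(): write either the full
 * uname information or only the OS name into 'out'. Return -1 and
 * put an error message into 'out' if uname() fails, 0 otherwise.
 */
static int describe_os(char *out, size_t outlen, int full)
{
	struct utsname u;

	assert(out && outlen > 0);
	if (uname(&u)) {
		snprintf(out, outlen, "uname() failed with error '%s' (%d)",
			 strerror(errno), errno);
		return -1;
	}
	if (full)
		snprintf(out, outlen, "%s %s %s %s",
			 u.sysname, u.release, u.version, u.machine);
	else
		snprintf(out, outlen, "%s", u.sysname);
	return 0;
}
```

A bug report style caller would pass a non-zero "full" value, while a
caller that should reveal as little as possible would pass 0 and get
only the 'sysname' field.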



* [PATCH v6 5/6] t5701: add setup test to remove side-effect dependency
  2025-02-15 15:50           ` [PATCH v6 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
                               ` (3 preceding siblings ...)
  2025-02-15 15:50             ` [PATCH v6 4/6] version: extend get_uname_info() to hide system details Usman Akinyemi
@ 2025-02-15 15:50             ` Usman Akinyemi
  2025-02-15 15:50             ` [PATCH v6 6/6] agent: advertise OS name via agent capability Usman Akinyemi
  2025-02-18 17:09             ` [PATCH v6 0/6][Outreachy] extend agent capability to include OS name Junio C Hamano
  6 siblings, 0 replies; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-15 15:50 UTC (permalink / raw)
  To: christian.couder, gitster
  Cc: Johannes.Schindelin, git, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, Christian Couder

Currently, the "test capability advertisement" test creates some files
with expected content which are used by other tests below it.

To remove that side-effect from this test, let's split up part of
it into a "setup"-type test which creates the files with expected content
which gets reused by multiple tests. This will be useful in a following
commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 t/t5701-git-serve.sh | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index de904c1655..4c24a188b9 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -7,22 +7,28 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
-test_expect_success 'test capability advertisement' '
+test_expect_success 'setup to generate files with expected content' '
+	printf "agent=git/%s\n" "$(git version | cut -d" " -f3)" >agent_capability &&
+
 	test_oid_cache <<-EOF &&
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
+
 	cat >expect.base <<-EOF &&
 	version 2
-	agent=git/$(git version | cut -d" " -f3)
+	$(cat agent_capability)
 	ls-refs=unborn
 	fetch=shallow wait-for-done
 	server-option
 	object-format=$(test_oid algo)
 	EOF
-	cat >expect.trailer <<-EOF &&
+	cat >expect.trailer <<-EOF
 	0000
 	EOF
+'
+
+test_expect_success 'test capability advertisement' '
 	cat expect.base expect.trailer >expect &&
 
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
-- 
2.48.1



* [PATCH v6 6/6] agent: advertise OS name via agent capability
  2025-02-15 15:50           ` [PATCH v6 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
                               ` (4 preceding siblings ...)
  2025-02-15 15:50             ` [PATCH v6 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
@ 2025-02-15 15:50             ` Usman Akinyemi
  2025-02-18 17:14               ` Junio C Hamano
  2025-02-18 17:09             ` [PATCH v6 0/6][Outreachy] extend agent capability to include OS name Junio C Hamano
  6 siblings, 1 reply; 108+ messages in thread
From: Usman Akinyemi @ 2025-02-15 15:50 UTC (permalink / raw)
  To: christian.couder, gitster
  Cc: Johannes.Schindelin, git, johncai86, me, phillip.wood, ps,
	rsbecker, sunshine, Christian Couder

As some issues that can happen with a Git client can be operating system
specific, it can be useful for a server to know which OS a client is
using. In the same way it can be useful for a client to know which OS
a server is using.

Our current agent capability is in the form of "package/version" (e.g.,
"git/1.8.3.1"). Let's extend it to include the operating system name (os),
i.e., in the form "package/version-os" (e.g., "git/1.8.3.1-Linux").

Including OS details in the agent capability simplifies implementation,
maintains backward compatibility, avoids introducing a new capability,
encourages adoption across Git-compatible software, and enhances
debugging by providing complete environment information without affecting
functionality. The operating system name is retrieved using the 'sysname'
field of the `uname(2)` system call or its equivalent.

However, there are differences between `uname(1)` (command-line utility)
and `uname(2)` (system call) outputs on Windows. These discrepancies
complicate testing on Windows platforms. For example:
  - `uname(1)` output: MINGW64_NT-10.0-20348.3.4.10-87d57229.x86_64\
  .2024-02-14.20:17.UTC.x86_64
  - `uname(2)` output: Windows.10.0.20348

On Windows, uname(2) is not actually system-supplied but is instead
already faked up by Git itself. We could have overcome the test issue
on Windows by implementing a new `uname` subcommand in `test-tool`
using uname(2), but apart from uname(2), which would be tested against
itself, there would be nothing platform specific, so it's simpler
to disable the tests on Windows.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
---
 Documentation/gitprotocol-v2.txt | 13 ++++++++-----
 connect.c                        |  2 +-
 t/t5701-git-serve.sh             | 16 +++++++++++++++-
 t/test-lib-functions.sh          |  8 ++++++++
 version.c                        | 29 ++++++++++++++++++++++++++++-
 version.h                        |  3 +++
 6 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
index 1652fef3ae..ce4a4e5e3b 100644
--- a/Documentation/gitprotocol-v2.txt
+++ b/Documentation/gitprotocol-v2.txt
@@ -184,11 +184,14 @@ form `agent=X`) to notify the client that the server is running version
 the `agent` capability with a value `Y` (in the form `agent=Y`) in its
 request to the server (but it MUST NOT do so if the server did not
 advertise the agent capability). The `X` and `Y` strings may contain any
-printable ASCII characters except space (i.e., the byte range 32 < x <
-127), and are typically of the form "package/version" (e.g.,
-"git/1.8.3.1"). The agent strings are purely informative for statistics
-and debugging purposes, and MUST NOT be used to programmatically assume
-the presence or absence of particular features.
+printable ASCII characters (i.e., the byte range 33 <= x <= 126), and are
+typically of the form "package/version-os" (e.g., "git/1.8.3.1-Linux")
+where `os` is the operating system name (e.g., "Linux"). `X` and `Y` can
+be configured using the GIT_USER_AGENT environment variable and it takes
+priority. The `os` is retrieved using the 'sysname' field of the `uname(2)`
+system call or its equivalent. The agent strings are purely informative for
+statistics and debugging purposes, and MUST NOT be used to programmatically
+assume the presence or absence of particular features.
 
 ls-refs
 ~~~~~~~
diff --git a/connect.c b/connect.c
index 10fad43e98..4d85479075 100644
--- a/connect.c
+++ b/connect.c
@@ -625,7 +625,7 @@ const char *parse_feature_value(const char *feature_list, const char *feature, s
 					*offset = found + len - orig_start;
 				return value;
 			}
-			/* feature with a value (e.g., "agent=git/1.2.3") */
+			/* feature with a value (e.g., "agent=git/1.2.3-Linux") */
 			else if (*value == '=') {
 				size_t end;
 
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 4c24a188b9..678a346ed0 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -8,13 +8,19 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 . ./test-lib.sh
 
 test_expect_success 'setup to generate files with expected content' '
-	printf "agent=git/%s\n" "$(git version | cut -d" " -f3)" >agent_capability &&
+	printf "agent=git/%s" "$(git version | cut -d" " -f3)" >agent_capability &&
 
 	test_oid_cache <<-EOF &&
 	wrong_algo sha1:sha256
 	wrong_algo sha256:sha1
 	EOF
 
+	if test_have_prereq WINDOWS
+	then
+		printf "agent=FAKE\n" >agent_capability
+	else
+		printf -- "-%s\n" $(uname -s | test_redact_non_printables) >>agent_capability
+	fi &&
 	cat >expect.base <<-EOF &&
 	version 2
 	$(cat agent_capability)
@@ -31,6 +37,10 @@ test_expect_success 'setup to generate files with expected content' '
 test_expect_success 'test capability advertisement' '
 	cat expect.base expect.trailer >expect &&
 
+	if test_have_prereq WINDOWS
+	then
+		GIT_USER_AGENT=FAKE && export GIT_USER_AGENT
+	fi &&
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
 		--advertise-capabilities >out &&
 	test-tool pkt-line unpack <out >actual &&
@@ -361,6 +371,10 @@ test_expect_success 'test capability advertisement with uploadpack.advertiseBund
 	    expect.extra \
 	    expect.trailer >expect &&
 
+	if test_have_prereq WINDOWS
+	then
+		GIT_USER_AGENT=FAKE && export GIT_USER_AGENT
+	fi &&
 	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
 		--advertise-capabilities >out &&
 	test-tool pkt-line unpack <out >actual &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78e054ab50..3465904323 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -2007,3 +2007,11 @@ test_trailing_hash () {
 		test-tool hexdump |
 		sed "s/ //g"
 }
+
+# Trim and replace each character with ascii code below 32 or above
+# 127 (included) using a dot '.' character.
+# Octal intervals \001-\040 and \177-\377
+# correspond to decimal intervals 1-32 and 127-255
+test_redact_non_printables () {
+    tr -d "\n\r" | tr "[\001-\040][\177-\377]" "."
+}
diff --git a/version.c b/version.c
index d95221a72a..8e927cf1eb 100644
--- a/version.c
+++ b/version.c
@@ -1,8 +1,9 @@
+#define USE_THE_REPOSITORY_VARIABLE
+
 #include "git-compat-util.h"
 #include "version.h"
 #include "version-def.h"
 #include "strbuf.h"
-#include "sane-ctype.h"
 #include "gettext.h"
 
 const char git_version_string[] = GIT_VERSION;
@@ -34,6 +35,27 @@ const char *git_user_agent(void)
 	return agent;
 }
 
+/*
+  Retrieve, sanitize and cache operating system info for subsequent
+  calls. Return a pointer to the sanitized operating system info
+  string.
+*/
+static const char *os_info(void)
+{
+	static const char *os = NULL;
+
+	if (!os) {
+		struct strbuf buf = STRBUF_INIT;
+
+		get_uname_info(&buf, 0);
+		/* Sanitize the os information immediately */
+		redact_non_printables(&buf);
+		os = strbuf_detach(&buf, NULL);
+	}
+
+	return os;
+}
+
 const char *git_user_agent_sanitized(void)
 {
 	static const char *agent = NULL;
@@ -42,6 +64,11 @@ const char *git_user_agent_sanitized(void)
 		struct strbuf buf = STRBUF_INIT;
 
 		strbuf_addstr(&buf, git_user_agent());
+
+		if (!getenv("GIT_USER_AGENT")) {
+			strbuf_addch(&buf, '-');
+			strbuf_addstr(&buf, os_info());
+		}
 		redact_non_printables(&buf);
 		agent = strbuf_detach(&buf, NULL);
 	}
diff --git a/version.h b/version.h
index 5eb586c0bd..bbde6d371a 100644
--- a/version.h
+++ b/version.h
@@ -1,6 +1,8 @@
 #ifndef VERSION_H
 #define VERSION_H
 
+struct repository;
+
 extern const char git_version_string[];
 extern const char git_built_from_commit_string[];
 
@@ -14,4 +16,5 @@ const char *git_user_agent_sanitized(void);
 */
 int get_uname_info(struct strbuf *buf, unsigned int full);
 
+
 #endif /* VERSION_H */
-- 
2.48.1
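
The version.c hunks above append "-<os>" to the agent string unless GIT_USER_AGENT is set, and then sanitize the result. A minimal standalone sketch of that behaviour follows; it does not use Git's strbuf API, and compose_agent() and redact() are hypothetical names chosen for illustration, not functions from the patch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: replace every byte outside 33..126 with '.'. */
static void redact(char *s)
{
	for (; *s; s++) {
		unsigned char c = (unsigned char)*s;

		if (c <= 32 || c >= 127)
			*s = '.';
	}
}

/*
 * Hypothetical stand-in for the agent construction in the patch.
 * When override is non-NULL (modelling GIT_USER_AGENT being set),
 * it is used as-is and no OS name is appended. Otherwise the
 * result is "<base>-<os>". The result is sanitized either way.
 * The caller owns the returned buffer; NULL means allocation failed.
 */
static char *compose_agent(const char *base, const char *os,
			   const char *override)
{
	size_t len = override ? strlen(override) + 1
			      : strlen(base) + 1 + strlen(os) + 1;
	char *out = malloc(len);

	if (!out)
		return NULL;
	if (override)
		snprintf(out, len, "%s", override);
	else
		snprintf(out, len, "%s-%s", base, os);
	redact(out);
	return out;
}
```

Under these assumptions, compose_agent("git/2.48.1", "Linux", NULL) gives "git/2.48.1-Linux", while passing "FAKE" as the override gives "FAKE", which is the value the Windows branch of the t5701 test expects.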



* Re: [PATCH v6 0/6][Outreachy] extend agent capability to include OS name
  2025-02-15 15:50           ` [PATCH v6 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
                               ` (5 preceding siblings ...)
  2025-02-15 15:50             ` [PATCH v6 6/6] agent: advertise OS name via agent capability Usman Akinyemi
@ 2025-02-18 17:09             ` Junio C Hamano
  6 siblings, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-02-18 17:09 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: christian.couder, Johannes.Schindelin, git, johncai86, me,
	phillip.wood, ps, rsbecker, sunshine

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> Changes since v5
> ================
>  - Used "-" instead of " " for separating "version" and "os" in the agent string.
>
> Usman Akinyemi (6):
>   version: replace manual ASCII checks with isprint() for clarity
>   version: refactor redact_non_printables()
>   version: refactor get_uname_info()
>   version: extend get_uname_info() to hide system details
>   t5701: add setup test to remove side-effect dependency
>   agent: advertise OS name via agent capability

Overall everything looks good.  I spotted just one nit in the
protocol documentation update, which I'll comment on separately.

Thanks.


* Re: [PATCH v6 6/6] agent: advertise OS name via agent capability
  2025-02-15 15:50             ` [PATCH v6 6/6] agent: advertise OS name via agent capability Usman Akinyemi
@ 2025-02-18 17:14               ` Junio C Hamano
  0 siblings, 0 replies; 108+ messages in thread
From: Junio C Hamano @ 2025-02-18 17:14 UTC (permalink / raw)
  To: Usman Akinyemi
  Cc: christian.couder, Johannes.Schindelin, git, johncai86, me,
	phillip.wood, ps, rsbecker, sunshine, Christian Couder

Usman Akinyemi <usmanakinyemi202@gmail.com> writes:

> diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
> ...
>  advertise the agent capability). The `X` and `Y` strings may contain any
> -printable ASCII characters except space (i.e., the byte range 32 < x <
> -127), and are typically of the form "package/version" (e.g.,
> ...
> -the presence or absence of particular features.
> +printable ASCII characters (i.e., the byte range 33 <= x <= 126), and are
> +typically of the form "package/version-os" (e.g., "git/1.8.3.1-Linux")

The above updates the way the byte range is expressed as an inequality
but the series does not change the byte range itself.  Hence, "any
printable ASCII characters except space" should stay the same as-is,
without losing "except space", I would think.

No need to resend just to update this.

Thanks.
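
The point about the byte range can be checked mechanically. The sketch below is a standalone illustration, not code from the series, and all function names in it are hypothetical. It compares the old wording (32 < x < 127), the new wording (33 <= x <= 126), and "printable except space" over every byte value, assuming the default "C" locale for isprint().

```c
#include <ctype.h>
#include <stdbool.h>

/* Range as written before the documentation change. */
static bool in_range_exclusive(int x)
{
	return 32 < x && x < 127;
}

/* Range as written after the documentation change. */
static bool in_range_inclusive(int x)
{
	return 33 <= x && x <= 126;
}

/* "Printable ASCII except space", assuming the "C" locale. */
static bool printable_except_space(int x)
{
	return isprint(x) && x != ' ';
}

/* Count byte values 0..255 on which the three forms disagree. */
static int count_disagreements(void)
{
	int x, n = 0;

	for (x = 0; x < 256; x++) {
		bool a = in_range_exclusive(x);

		if (a != in_range_inclusive(x) ||
		    a != printable_except_space(x))
			n++;
	}
	return n;
}
```

Both inequalities exclude byte 32, which isprint() alone would accept, so the phrase "except space" is what keeps the prose in agreement with the range.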


Thread overview: 108+ messages
2025-01-06 10:30 [PATCH 0/4][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
2025-01-06 10:30 ` [PATCH 1/4] version: refactor redact_non_printables() Usman Akinyemi
2025-01-06 22:35   ` Eric Sunshine
2025-01-08 12:58     ` Usman Akinyemi
2025-01-06 10:30 ` [PATCH 2/4] version: refactor get_uname_info() Usman Akinyemi
2025-01-06 16:04   ` Junio C Hamano
2025-01-08 13:06     ` Usman Akinyemi
2025-01-06 10:30 ` [PATCH 3/4] connect: advertise OS version Usman Akinyemi
2025-01-06 16:22   ` Junio C Hamano
2025-01-08 13:06     ` Usman Akinyemi
2025-01-08 16:15       ` Junio C Hamano
2025-01-09 14:25         ` Usman Akinyemi
2025-01-09 15:46           ` Junio C Hamano
2025-01-10 17:56             ` Usman Akinyemi
2025-01-10 19:24               ` Junio C Hamano
2025-01-11 11:07                 ` Usman Akinyemi
2025-01-13 15:46                   ` Junio C Hamano
2025-01-13 18:26                     ` Usman Akinyemi
2025-01-13 19:47                       ` Junio C Hamano
2025-01-13 20:07                         ` rsbecker
2025-01-06 23:17   ` Eric Sunshine
2025-01-08 13:14     ` Usman Akinyemi
2025-01-06 10:30 ` [PATCH 4/4] version: introduce osversion.command config for os-version output Usman Akinyemi
2025-01-17 10:46 ` [PATCH v2 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
2025-01-17 10:46   ` [PATCH v2 1/6] version: refactor redact_non_printables() Usman Akinyemi
2025-01-17 18:26     ` Junio C Hamano
2025-01-17 19:48       ` Junio C Hamano
2025-01-20 17:10       ` Usman Akinyemi
2025-01-21  8:12         ` Christian Couder
2025-01-21 18:01           ` Junio C Hamano
2025-01-17 10:46   ` [PATCH v2 2/6] version: refactor get_uname_info() Usman Akinyemi
2025-01-17 10:46   ` [PATCH v2 3/6] version: extend get_uname_info() to hide system details Usman Akinyemi
2025-01-17 18:27     ` Junio C Hamano
2025-01-17 10:46   ` [PATCH v2 4/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
2025-01-17 19:31     ` Junio C Hamano
2025-01-20 17:32       ` Usman Akinyemi
2025-01-20 19:52         ` Junio C Hamano
2025-01-21 13:43           ` Usman Akinyemi
2025-01-17 10:46   ` [PATCH v2 5/6] connect: advertise OS version Usman Akinyemi
2025-01-17 19:35     ` Junio C Hamano
2025-01-17 22:22     ` Junio C Hamano
2025-01-17 22:47       ` rsbecker
2025-01-17 23:04         ` Junio C Hamano
2025-01-20 18:15       ` Usman Akinyemi
2025-01-21 19:06         ` Junio C Hamano
2025-01-17 10:46   ` [PATCH v2 6/6] version: introduce osversion.command config for os-version output Usman Akinyemi
2025-01-17 21:44     ` Eric Sunshine
2025-01-20 18:17       ` Usman Akinyemi
2025-01-20 18:41         ` Eric Sunshine
2025-01-20 19:08           ` Usman Akinyemi
2025-01-17 22:33     ` Junio C Hamano
2025-01-17 22:49       ` rsbecker
2025-01-17 23:06         ` Junio C Hamano
2025-01-17 23:18           ` rsbecker
2025-01-20 18:58       ` Usman Akinyemi
2025-01-21 19:14         ` Junio C Hamano
2025-01-21 19:51           ` rsbecker
2025-01-24 12:21   ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Usman Akinyemi
2025-01-24 12:21     ` [PATCH v3 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
2025-01-24 18:13       ` Junio C Hamano
2025-01-24 12:21     ` [PATCH v3 2/6] version: refactor redact_non_printables() Usman Akinyemi
2025-01-24 12:21     ` [PATCH v3 3/6] version: refactor get_uname_info() Usman Akinyemi
2025-01-24 12:21     ` [PATCH v3 4/6] version: extend get_uname_info() to hide system details Usman Akinyemi
2025-01-24 12:21     ` [PATCH v3 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
2025-01-24 18:12       ` Junio C Hamano
2025-01-24 12:21     ` [PATCH v3 6/6] connect: advertise OS version Usman Akinyemi
2025-02-05 18:52       ` [PATCH v4 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
2025-02-05 18:52         ` [PATCH v4 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
2025-02-05 18:52         ` [PATCH v4 2/6] version: refactor redact_non_printables() Usman Akinyemi
2025-02-05 18:52         ` [PATCH v4 3/6] version: refactor get_uname_info() Usman Akinyemi
2025-02-05 18:52         ` [PATCH v4 4/6] version: extend get_uname_info() to hide system details Usman Akinyemi
2025-02-05 18:52         ` [PATCH v4 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
2025-02-05 18:52         ` [PATCH v4 6/6] agent: advertise OS name via agent capability Usman Akinyemi
2025-02-05 21:48           ` Junio C Hamano
2025-02-06  6:37             ` Usman Akinyemi
2025-02-06 15:13               ` Junio C Hamano
2025-02-07 17:27                 ` Usman Akinyemi
2025-02-07 17:57                   ` Junio C Hamano
2025-02-07 19:25             ` Usman Akinyemi
2025-02-14 12:36         ` [PATCH v5 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
2025-02-14 12:36           ` [PATCH v5 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
2025-02-14 12:36           ` [PATCH v5 2/6] version: refactor redact_non_printables() Usman Akinyemi
2025-02-14 12:36           ` [PATCH v5 3/6] version: refactor get_uname_info() Usman Akinyemi
2025-02-14 12:36           ` [PATCH v5 4/6] version: extend get_uname_info() to hide system details Usman Akinyemi
2025-02-14 12:36           ` [PATCH v5 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
2025-02-14 21:49             ` Junio C Hamano
2025-02-14 12:36           ` [PATCH v5 6/6] agent: advertise OS name via agent capability Usman Akinyemi
2025-02-14 22:07             ` Junio C Hamano
2025-02-15 15:29               ` Usman Akinyemi
2025-02-15 15:50           ` [PATCH v6 0/6][Outreachy] extend agent capability to include OS name Usman Akinyemi
2025-02-15 15:50             ` [PATCH v6 1/6] version: replace manual ASCII checks with isprint() for clarity Usman Akinyemi
2025-02-15 15:50             ` [PATCH v6 2/6] version: refactor redact_non_printables() Usman Akinyemi
2025-02-15 15:50             ` [PATCH v6 3/6] version: refactor get_uname_info() Usman Akinyemi
2025-02-15 15:50             ` [PATCH v6 4/6] version: extend get_uname_info() to hide system details Usman Akinyemi
2025-02-15 15:50             ` [PATCH v6 5/6] t5701: add setup test to remove side-effect dependency Usman Akinyemi
2025-02-15 15:50             ` [PATCH v6 6/6] agent: advertise OS name via agent capability Usman Akinyemi
2025-02-18 17:14               ` Junio C Hamano
2025-02-18 17:09             ` [PATCH v6 0/6][Outreachy] extend agent capability to include OS name Junio C Hamano
2025-01-24 18:39     ` [PATCH v3 0/6][Outreachy] Introduce os-version Capability with Configurable Options Junio C Hamano
2025-01-27 13:38       ` Christian Couder
2025-01-27 15:26         ` Junio C Hamano
2025-01-31 14:30           ` Christian Couder
2025-01-31 16:37             ` Junio C Hamano
2025-01-31 19:42               ` Usman Akinyemi
2025-01-31 20:15                 ` Junio C Hamano
2025-01-31 19:46               ` Usman Akinyemi
2025-01-31 20:17                 ` Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2024-12-06 12:42 [PATCH v3 0/5] Introduce a "promisor-remote" capability Christian Couder
2025-01-27 15:16 ` [PATCH v4 0/6] " Christian Couder
2025-01-27 15:16   ` [PATCH v4 2/6] version: refactor redact_non_printables() Christian Couder
