* [PATCH v2 0/6] bswap.h: Rework ntohl handling
@ 2025-06-11 22:14 Sebastian Andrzej Siewior
2025-06-11 22:14 ` [PATCH v2 1/6] Revert "bswap.h: add support for built-in bswap functions" Sebastian Andrzej Siewior
` (8 more replies)
0 siblings, 9 replies; 20+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-06-11 22:14 UTC (permalink / raw)
To: git; +Cc: Sebastian Andrzej Siewior
Hi,
since the add of bswap32/64() support based on __builtin support, the
usage of ntohl() macros is broken on big endian machines because the
macros are always overwritten providing a swap.
The first patch in the series reverts the change and the following
patches try to improve the situation by allowing to always provide an
optimized version.
I've been looking at recent compiler and they manage to recognize the
manual shifting and use an optimize opcode if available. The ntohl
version provided by glibc already provides an "optimized" version which
makes an optimisation in git almost not needed.
One of the motivation behind overwriting/ providing an optimized
version was to provide a macro instead of using a function call. One
libc that is still providing ntohl as a function call is musl. So it
might makes sense to keep it.
While ntohl() is provided by the libc, the ntohll() is not. I found it
only on Windows provided by winsock.h.
I haven't touched the put/get_be*() macros. gcc & clang are both smart
enough to swap the content accordingly and perform a single store/ load.
Only the msvc seems to strugle here and performs multiple bytes stores/
loads and shifts.
Sebastian Andrzej Siewior (6):
Revert "bswap.h: add support for built-in bswap functions"
bswap.h: Add support for __BYTE_ORDER__
bswap.h: Define GIT_LITTLE_ENDIAN on msvc as little endian
bswap.h: Always overwrite ntohl/ ntohll macros
bswap.h: Remove optimized x86 version of bswap32/64
bswap.h: Provide a built-in based version of bswap32/64 if possible
compat/bswap.h | 118 ++++++++++++++++++-------------------------------
1 file changed, 44 insertions(+), 74 deletions(-)
--
2.49.0
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 1/6] Revert "bswap.h: add support for built-in bswap functions"
2025-06-11 22:14 [PATCH v2 0/6] bswap.h: Rework ntohl handling Sebastian Andrzej Siewior
@ 2025-06-11 22:14 ` Sebastian Andrzej Siewior
2025-06-12 20:04 ` Junio C Hamano
2025-06-11 22:14 ` [PATCH v2 2/6] bswap.h: Add support for __BYTE_ORDER__ Sebastian Andrzej Siewior
` (7 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-06-11 22:14 UTC (permalink / raw)
To: git; +Cc: Sebastian Andrzej Siewior
Since 6547d1c9 (bswap.h: add support for built-in bswap
functions, 2025-04-23) tweaked the way the bswap32/64 macros are
defined, on platforms with __builtin_bswap32/64 supported, the
bswap32/64 macros are defined even on big endian platforms.
However this file assumes that bswap31/64 are defined ONLY on
little endian machines and uses that assumption to redefine
ntohl/ntohll macros. The said commit broke t4014-format-patch.sh test,
among many others on s390x.
Revert the commit.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
compat/bswap.h | 14 +-------------
1 file changed, 1 insertion(+), 13 deletions(-)
diff --git a/compat/bswap.h b/compat/bswap.h
index 9e0f98e00b93a..b34054f2bd728 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -35,19 +35,7 @@ static inline uint64_t default_bswap64(uint64_t val)
#undef bswap32
#undef bswap64
-/**
- * __has_builtin is available since Clang 10 and GCC 10.
- * Below is a fallback for older compilers.
- */
-#ifndef __has_builtin
- #define __has_builtin(x) 0
-#endif
-
-#if __has_builtin(__builtin_bswap32) && __has_builtin(__builtin_bswap64)
-#define bswap32(x) __builtin_bswap32((x))
-#define bswap64(x) __builtin_bswap64((x))
-
-#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
+#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define bswap32 git_bswap32
static inline uint32_t git_bswap32(uint32_t x)
--
2.49.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 2/6] bswap.h: Add support for __BYTE_ORDER__
2025-06-11 22:14 [PATCH v2 0/6] bswap.h: Rework ntohl handling Sebastian Andrzej Siewior
2025-06-11 22:14 ` [PATCH v2 1/6] Revert "bswap.h: add support for built-in bswap functions" Sebastian Andrzej Siewior
@ 2025-06-11 22:14 ` Sebastian Andrzej Siewior
2025-06-12 0:24 ` brian m. carlson
2025-06-12 20:10 ` Junio C Hamano
2025-06-11 22:14 ` [PATCH v2 3/6] bswap.h: Define GIT_LITTLE_ENDIAN on msvc as little endian Sebastian Andrzej Siewior
` (6 subsequent siblings)
8 siblings, 2 replies; 20+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-06-11 22:14 UTC (permalink / raw)
To: git; +Cc: Sebastian Andrzej Siewior
The __BYTE_ORDER__ define is provided by gcc (since ~v4.6), clang
(since ~v3.2) and icc (since ~16.0.3). It is not provided by msvc as of
v19.43 / 17.13.6.
The __BYTE_ORDER and BYTE_ORDER macros are libc specific and are not
available on all supported platforms such as mingw.
Add support for the __BYTE_ORDER__ macro as a fallback.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
compat/bswap.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/compat/bswap.h b/compat/bswap.h
index b34054f2bd728..0a457542dd76a 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -116,6 +116,12 @@ static inline uint64_t git_bswap64(uint64_t x)
# define GIT_LITTLE_ENDIAN LITTLE_ENDIAN
# define GIT_BIG_ENDIAN BIG_ENDIAN
+#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && defined(__ORDER_BIG_ENDIAN__)
+
+# define GIT_BYTE_ORDER __BYTE_ORDER__
+# define GIT_LITTLE_ENDIAN __ORDER_LITTLE_ENDIAN__
+# define GIT_BIG_ENDIAN __ORDER_BIG_ENDIAN__
+
#else
# define GIT_BIG_ENDIAN 4321
--
2.49.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 3/6] bswap.h: Define GIT_LITTLE_ENDIAN on msvc as little endian
2025-06-11 22:14 [PATCH v2 0/6] bswap.h: Rework ntohl handling Sebastian Andrzej Siewior
2025-06-11 22:14 ` [PATCH v2 1/6] Revert "bswap.h: add support for built-in bswap functions" Sebastian Andrzej Siewior
2025-06-11 22:14 ` [PATCH v2 2/6] bswap.h: Add support for __BYTE_ORDER__ Sebastian Andrzej Siewior
@ 2025-06-11 22:14 ` Sebastian Andrzej Siewior
2025-06-12 20:11 ` Junio C Hamano
2025-06-11 22:14 ` [PATCH v2 4/6] bswap.h: Always overwrite ntohl/ ntohll macros Sebastian Andrzej Siewior
` (5 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-06-11 22:14 UTC (permalink / raw)
To: git; +Cc: Sebastian Andrzej Siewior
The Microsoft Visual C++ (MSVC) compiler (as of Visual Studio 2022
version 17.13.6) does not define __BYTE_ORDER__ and its C-library does
not define __BYTE_ORDER. The compiler is supported only an arm64 and x86
which are all little endian.
Define GIT_BYTE_ORDER on msvc as little endian to avoid further checks.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
compat/bswap.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/compat/bswap.h b/compat/bswap.h
index 0a457542dd76a..fd604d9f7b74b 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -81,6 +81,10 @@ static inline uint64_t git_bswap64(uint64_t x)
#define bswap32(x) _byteswap_ulong(x)
#define bswap64(x) _byteswap_uint64(x)
+#define GIT_LITTLE_ENDIAN 1234
+#define GIT_BIG_ENDIAN 4321
+#define GIT_BYTE_ORDER GIT_LITTLE_ENDIAN
+
#endif
#if defined(bswap32)
@@ -122,7 +126,7 @@ static inline uint64_t git_bswap64(uint64_t x)
# define GIT_LITTLE_ENDIAN __ORDER_LITTLE_ENDIAN__
# define GIT_BIG_ENDIAN __ORDER_BIG_ENDIAN__
-#else
+#elif !defined(GIT_BYTE_ORDER)
# define GIT_BIG_ENDIAN 4321
# define GIT_LITTLE_ENDIAN 1234
--
2.49.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 4/6] bswap.h: Always overwrite ntohl/ ntohll macros
2025-06-11 22:14 [PATCH v2 0/6] bswap.h: Rework ntohl handling Sebastian Andrzej Siewior
` (2 preceding siblings ...)
2025-06-11 22:14 ` [PATCH v2 3/6] bswap.h: Define GIT_LITTLE_ENDIAN on msvc as little endian Sebastian Andrzej Siewior
@ 2025-06-11 22:14 ` Sebastian Andrzej Siewior
2025-06-12 0:14 ` brian m. carlson
2025-06-11 22:14 ` [PATCH v2 5/6] bswap.h: Remove optimized x86 version of bswap32/64 Sebastian Andrzej Siewior
` (4 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-06-11 22:14 UTC (permalink / raw)
To: git; +Cc: Sebastian Andrzej Siewior
The ntohl and htonl macros are redefined because the provided macros were
not always optimal. Sometimes it was a function call, sometimes it was a
macro which did the shifting. Using the 'bswap' opcode on x86 provides
probably better performance than performing the shifting.
These macros are only overwritten on x86 if the "optimized" version is
available.
The ntohll and htonll macros are not available on every platform (at
least glibc does not provide them) which means they need to be defined
once the endianness of the system is determined.
In order to get a more symmetrical setup, redfine the macros once the
endianness of the system has been determined.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
compat/bswap.h | 54 ++++++++++++++++++++++++--------------------------
1 file changed, 26 insertions(+), 28 deletions(-)
diff --git a/compat/bswap.h b/compat/bswap.h
index fd604d9f7b74b..aeef304f671f5 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -87,27 +87,6 @@ static inline uint64_t git_bswap64(uint64_t x)
#endif
-#if defined(bswap32)
-
-#undef ntohl
-#undef htonl
-#define ntohl(x) bswap32(x)
-#define htonl(x) bswap32(x)
-
-#endif
-
-#if defined(bswap64)
-
-#undef ntohll
-#undef htonll
-#define ntohll(x) bswap64(x)
-#define htonll(x) bswap64(x)
-
-#else
-
-#undef ntohll
-#undef htonll
-
#if defined(__BYTE_ORDER) && defined(__LITTLE_ENDIAN) && defined(__BIG_ENDIAN)
# define GIT_BYTE_ORDER __BYTE_ORDER
@@ -145,14 +124,33 @@ static inline uint64_t git_bswap64(uint64_t x)
#endif
-#if GIT_BYTE_ORDER == GIT_BIG_ENDIAN
-# define ntohll(n) (n)
-# define htonll(n) (n)
-#else
-# define ntohll(n) default_bswap64(n)
-# define htonll(n) default_bswap64(n)
-#endif
+#undef ntohl
+#undef htonl
+#undef ntohll
+#undef htonll
+#if GIT_BYTE_ORDER == GIT_BIG_ENDIAN
+# define ntohl(x) (x)
+# define htonl(x) (x)
+# define ntohll(x) (x)
+# define htonll(x) (x)
+#else
+
+# if defined(bswap32)
+# define ntohl(x) bswap32(x)
+# define htonl(x) bswap32(x)
+# else
+# define ntohl(x) default_swab32(x)
+# define htonl(x) default_swab32(x)
+# endif
+
+# if defined(bswap64)
+# define ntohll(x) bswap64(x)
+# define htonll(x) bswap64(x)
+# else
+# define ntohll(x) default_bswap64(x)
+# define htonll(x) default_bswap64(x)
+# endif
#endif
static inline uint16_t get_be16(const void *ptr)
--
2.49.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 5/6] bswap.h: Remove optimized x86 version of bswap32/64
2025-06-11 22:14 [PATCH v2 0/6] bswap.h: Rework ntohl handling Sebastian Andrzej Siewior
` (3 preceding siblings ...)
2025-06-11 22:14 ` [PATCH v2 4/6] bswap.h: Always overwrite ntohl/ ntohll macros Sebastian Andrzej Siewior
@ 2025-06-11 22:14 ` Sebastian Andrzej Siewior
2025-06-26 15:55 ` Kristoffer Haugsbakk
2025-06-11 22:14 ` [PATCH v2 5/6] bswap: " Sebastian Andrzej Siewior
` (3 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-06-11 22:14 UTC (permalink / raw)
To: git; +Cc: Sebastian Andrzej Siewior
On x86 the bswap32/64 macro is implemenated based on the x86 opcode wich
performs the required shifting in just one opcode.
The other CPUs fallback to the generic shifting as implemented by
default_swab32() and default_bswap64() if needed.
I've been looking at how good a compiler is at recognizing the default
shift and emitting an optimized operation:
- x86, arm64 msvc v19.20
default_swab32() optimized
default_bswap64() shifts
_byteswap_uint64() otimized
- x86, arm64 msvc v19.37
default_swab32() optimized
default_bswap64() optimized
_byteswap_uint64() otimized
- arm64, gcc-4.9.4: optimized
- x86-64, gcc-4.4.7: shifts
- x86-64, gcc-4.5.3: optimized
- x86-64, clang-3.0: optimized
Given that gcc-4.5 and clang-3.0 are fairly old, any recent compiler
should recognize the shift.
Remove the optimized x86 version and rely on the compiler.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
compat/bswap.h | 41 +----------------------------------------
1 file changed, 1 insertion(+), 40 deletions(-)
diff --git a/compat/bswap.h b/compat/bswap.h
index aeef304f671f5..ed00f6d1d53f3 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -35,46 +35,7 @@ static inline uint64_t default_bswap64(uint64_t val)
#undef bswap32
#undef bswap64
-#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
-
-#define bswap32 git_bswap32
-static inline uint32_t git_bswap32(uint32_t x)
-{
- uint32_t result;
- if (__builtin_constant_p(x))
- result = default_swab32(x);
- else
- __asm__("bswap %0" : "=r" (result) : "0" (x));
- return result;
-}
-
-#define bswap64 git_bswap64
-#if defined(__x86_64__)
-static inline uint64_t git_bswap64(uint64_t x)
-{
- uint64_t result;
- if (__builtin_constant_p(x))
- result = default_bswap64(x);
- else
- __asm__("bswap %q0" : "=r" (result) : "0" (x));
- return result;
-}
-#else
-static inline uint64_t git_bswap64(uint64_t x)
-{
- union { uint64_t i64; uint32_t i32[2]; } tmp, result;
- if (__builtin_constant_p(x))
- result.i64 = default_bswap64(x);
- else {
- tmp.i64 = x;
- result.i32[0] = git_bswap32(tmp.i32[1]);
- result.i32[1] = git_bswap32(tmp.i32[0]);
- }
- return result.i64;
-}
-#endif
-
-#elif defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64) || defined(_M_ARM64))
+#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64) || defined(_M_ARM64))
#include <stdlib.h>
--
2.49.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 5/6] bswap: Remove optimized x86 version of bswap32/64
2025-06-11 22:14 [PATCH v2 0/6] bswap.h: Rework ntohl handling Sebastian Andrzej Siewior
` (4 preceding siblings ...)
2025-06-11 22:14 ` [PATCH v2 5/6] bswap.h: Remove optimized x86 version of bswap32/64 Sebastian Andrzej Siewior
@ 2025-06-11 22:14 ` Sebastian Andrzej Siewior
2025-06-11 22:14 ` [PATCH v2 6/6] bswap.h: Provide a built-in based version of bswap32/64 if possible Sebastian Andrzej Siewior
` (2 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-06-11 22:14 UTC (permalink / raw)
To: git; +Cc: Sebastian Andrzej Siewior
On x86 the bswap32/64 macro is implemenated based on the x86 opcode wich
performs the required shifting in just one opcode.
The other CPUs fallback to the generic shifting as implemented by
default_swab32() and default_bswap64() if needed.
I've been looking at how good a compiler is at recognizing the default
shift and emitting an optimized operation:
- x86, arm64 msvc v19.20
default_swab32() optimized
default_bswap64() shifts
_byteswap_uint64() otimized
- x86, arm64 msvc v19.37
default_swab32() optimized
default_bswap64() optimized
_byteswap_uint64() otimized
- arm64, gcc-4.9.4: optimized
- x86-64, gcc-4.4.7: shifts
- x86-64, gcc-4.5.3: optimized
- x86-64, clang-3.0: optimized
Given that gcc-4.5 and clang-3.0 are fairly old, any recent compiler
should recognize the shift.
Remove the optimized x86 version and rely on the compiler.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
compat/bswap.h | 41 +----------------------------------------
1 file changed, 1 insertion(+), 40 deletions(-)
diff --git a/compat/bswap.h b/compat/bswap.h
index aeef304f671f5..ed00f6d1d53f3 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -35,46 +35,7 @@ static inline uint64_t default_bswap64(uint64_t val)
#undef bswap32
#undef bswap64
-#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
-
-#define bswap32 git_bswap32
-static inline uint32_t git_bswap32(uint32_t x)
-{
- uint32_t result;
- if (__builtin_constant_p(x))
- result = default_swab32(x);
- else
- __asm__("bswap %0" : "=r" (result) : "0" (x));
- return result;
-}
-
-#define bswap64 git_bswap64
-#if defined(__x86_64__)
-static inline uint64_t git_bswap64(uint64_t x)
-{
- uint64_t result;
- if (__builtin_constant_p(x))
- result = default_bswap64(x);
- else
- __asm__("bswap %q0" : "=r" (result) : "0" (x));
- return result;
-}
-#else
-static inline uint64_t git_bswap64(uint64_t x)
-{
- union { uint64_t i64; uint32_t i32[2]; } tmp, result;
- if (__builtin_constant_p(x))
- result.i64 = default_bswap64(x);
- else {
- tmp.i64 = x;
- result.i32[0] = git_bswap32(tmp.i32[1]);
- result.i32[1] = git_bswap32(tmp.i32[0]);
- }
- return result.i64;
-}
-#endif
-
-#elif defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64) || defined(_M_ARM64))
+#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64) || defined(_M_ARM64))
#include <stdlib.h>
--
2.49.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 6/6] bswap.h: Provide a built-in based version of bswap32/64 if possible
2025-06-11 22:14 [PATCH v2 0/6] bswap.h: Rework ntohl handling Sebastian Andrzej Siewior
` (5 preceding siblings ...)
2025-06-11 22:14 ` [PATCH v2 5/6] bswap: " Sebastian Andrzej Siewior
@ 2025-06-11 22:14 ` Sebastian Andrzej Siewior
2025-06-12 20:21 ` [PATCH v2 0/6] bswap.h: Rework ntohl handling Junio C Hamano
2025-07-07 22:43 ` Junio C Hamano
8 siblings, 0 replies; 20+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-06-11 22:14 UTC (permalink / raw)
To: git; +Cc: Sebastian Andrzej Siewior
The compiler is in general able to recognize the endian shift and
replace it with an optimized opcode if possible. On certain
architectures such as RiscV or MIPS the situation can get complicated.
They don't provide an optimized opcode and masking the "higher" bits may
required loading a constant which needs shifting. This causes the
compiler to emit a lot of instructions for the operation.
The provided builtin directive on these architecture calls a function
which does the operation instead of emitting the code for operation.
Bring back the change from commit 6547d1c9 (bswap.h: add support for
built-in bswap functions, 2025-04-23). The bswap32/64 macro can now be
defined unconditionally so it won't regress on big endian architectures.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
compat/bswap.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/compat/bswap.h b/compat/bswap.h
index ed00f6d1d53f3..28635ebc690e3 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -32,6 +32,14 @@ static inline uint64_t default_bswap64(uint64_t val)
((val & (uint64_t)0xff00000000000000ULL) >> 56));
}
+/*
+ * __has_builtin is available since Clang 10 and GCC 10.
+ * Below is a fallback for older compilers.
+ */
+#ifndef __has_builtin
+# define __has_builtin(x) 0
+#endif
+
#undef bswap32
#undef bswap64
@@ -46,6 +54,11 @@ static inline uint64_t default_bswap64(uint64_t val)
#define GIT_BIG_ENDIAN 4321
#define GIT_BYTE_ORDER GIT_LITTLE_ENDIAN
+#elif __has_builtin(__builtin_bswap32) && __has_builtin(__builtin_bswap64)
+
+#define bswap32(x) __builtin_bswap32((x))
+#define bswap64(x) __builtin_bswap64((x))
+
#endif
#if defined(__BYTE_ORDER) && defined(__LITTLE_ENDIAN) && defined(__BIG_ENDIAN)
--
2.49.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2 4/6] bswap.h: Always overwrite ntohl/ ntohll macros
2025-06-11 22:14 ` [PATCH v2 4/6] bswap.h: Always overwrite ntohl/ ntohll macros Sebastian Andrzej Siewior
@ 2025-06-12 0:14 ` brian m. carlson
2025-06-12 20:16 ` Junio C Hamano
0 siblings, 1 reply; 20+ messages in thread
From: brian m. carlson @ 2025-06-12 0:14 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 778 bytes --]
On 2025-06-11 at 22:14:40, Sebastian Andrzej Siewior wrote:
> The ntohl and htonl macros are redefined because the provided macros were
> not always optimal. Sometimes it was a function call, sometimes it was a
> macro which did the shifting. Using the 'bswap' opcode on x86 provides
> probably better performance than performing the shifting.
I believe that the peephole optimizer will almost always optimize them
to the bswap or equivalent opcode, much like it recognizes how to
generate rotate opcodes from two shifts and an or, so they should
actually be equivalent.
GCC and clang both emit simple bswap instructions with `-O2`, which is
the optimization level we use: https://godbolt.org/z/1r8P1Pqo7.
--
brian m. carlson (they/them)
Toronto, Ontario, CA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 2/6] bswap.h: Add support for __BYTE_ORDER__
2025-06-11 22:14 ` [PATCH v2 2/6] bswap.h: Add support for __BYTE_ORDER__ Sebastian Andrzej Siewior
@ 2025-06-12 0:24 ` brian m. carlson
2025-06-12 0:43 ` Collin Funk
2025-06-12 20:10 ` Junio C Hamano
1 sibling, 1 reply; 20+ messages in thread
From: brian m. carlson @ 2025-06-12 0:24 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 1696 bytes --]
On 2025-06-11 at 22:14:38, Sebastian Andrzej Siewior wrote:
> The __BYTE_ORDER__ define is provided by gcc (since ~v4.6), clang
> (since ~v3.2) and icc (since ~16.0.3). It is not provided by msvc as of
> v19.43 / 17.13.6.
> The __BYTE_ORDER and BYTE_ORDER macros are libc specific and are not
> available on all supported platforms such as mingw.
>
> Add support for the __BYTE_ORDER__ macro as a fallback.
>
> Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> ---
> compat/bswap.h | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/compat/bswap.h b/compat/bswap.h
> index b34054f2bd728..0a457542dd76a 100644
> --- a/compat/bswap.h
> +++ b/compat/bswap.h
> @@ -116,6 +116,12 @@ static inline uint64_t git_bswap64(uint64_t x)
> # define GIT_LITTLE_ENDIAN LITTLE_ENDIAN
> # define GIT_BIG_ENDIAN BIG_ENDIAN
>
> +#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && defined(__ORDER_BIG_ENDIAN__)
> +
> +# define GIT_BYTE_ORDER __BYTE_ORDER__
> +# define GIT_LITTLE_ENDIAN __ORDER_LITTLE_ENDIAN__
> +# define GIT_BIG_ENDIAN __ORDER_BIG_ENDIAN__
> +
One additional option you can add if you want (it's completely optional)
is that if `__STDC_VERSION__` is 202311L or larger, then you can
`#include <stdbit.h>`, which has `__STDC_ENDIAN_LITTLE__`,
`__STDC_ENDIAN_BIG__`, and `__STDC_ENDIAN_NATIVE__`.
That will work on a modern GCC or clang with an appropriate compiler
flag. I don't know about MSVC, but it might be an option for people who
want to use more esoteric compilers which support standards but aren't
very compatible with GCC and clang.
--
brian m. carlson (they/them)
Toronto, Ontario, CA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 2/6] bswap.h: Add support for __BYTE_ORDER__
2025-06-12 0:24 ` brian m. carlson
@ 2025-06-12 0:43 ` Collin Funk
0 siblings, 0 replies; 20+ messages in thread
From: Collin Funk @ 2025-06-12 0:43 UTC (permalink / raw)
To: brian m. carlson; +Cc: Sebastian Andrzej Siewior, git
"brian m. carlson" <sandals@crustytoothpaste.net> writes:
> One additional option you can add if you want (it's completely optional)
> is that if `__STDC_VERSION__` is 202311L or larger, then you can
> `#include <stdbit.h>`, which has `__STDC_ENDIAN_LITTLE__`,
> `__STDC_ENDIAN_BIG__`, and `__STDC_ENDIAN_NATIVE__`.
>
> That will work on a modern GCC or clang with an appropriate compiler
> flag. I don't know about MSVC, but it might be an option for people who
> want to use more esoteric compilers which support standards but aren't
> very compatible with GCC and clang.
Another standard that is too new to be common, but POSIX.1-2024 added
<endian.h> with BYTE_ORDER, LITTLE_ENDIAN, and BIG_ENDIAN [1].
In practice, some systems will have it because glibc had it for a long
time.
Collin
[1] https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/endian.h.html
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/6] Revert "bswap.h: add support for built-in bswap functions"
2025-06-11 22:14 ` [PATCH v2 1/6] Revert "bswap.h: add support for built-in bswap functions" Sebastian Andrzej Siewior
@ 2025-06-12 20:04 ` Junio C Hamano
0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2025-06-12 20:04 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: git
Sebastian Andrzej Siewior <sebastian@breakpoint.cc> writes:
> Since 6547d1c9 (bswap.h: add support for built-in bswap
> functions, 2025-04-23) tweaked the way the bswap32/64 macros are
> defined, on platforms with __builtin_bswap32/64 supported, the
> bswap32/64 macros are defined even on big endian platforms.
>
> However this file assumes that bswap31/64 are defined ONLY on
31/64 -> 32/64? Just sanity checking the typofix I plan to make
locally while applying this patch.
> little endian machines and uses that assumption to redefine
> ntohl/ntohll macros. The said commit broke t4014-format-patch.sh test,
> among many others on s390x.
>
> Revert the commit.
>
> Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> ---
> compat/bswap.h | 14 +-------------
> 1 file changed, 1 insertion(+), 13 deletions(-)
>
> diff --git a/compat/bswap.h b/compat/bswap.h
> index 9e0f98e00b93a..b34054f2bd728 100644
> --- a/compat/bswap.h
> +++ b/compat/bswap.h
> @@ -35,19 +35,7 @@ static inline uint64_t default_bswap64(uint64_t val)
> #undef bswap32
> #undef bswap64
>
> -/**
> - * __has_builtin is available since Clang 10 and GCC 10.
> - * Below is a fallback for older compilers.
> - */
> -#ifndef __has_builtin
> - #define __has_builtin(x) 0
> -#endif
> -
> -#if __has_builtin(__builtin_bswap32) && __has_builtin(__builtin_bswap64)
> -#define bswap32(x) __builtin_bswap32((x))
> -#define bswap64(x) __builtin_bswap64((x))
> -
> -#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
> +#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
>
> #define bswap32 git_bswap32
> static inline uint32_t git_bswap32(uint32_t x)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 2/6] bswap.h: Add support for __BYTE_ORDER__
2025-06-11 22:14 ` [PATCH v2 2/6] bswap.h: Add support for __BYTE_ORDER__ Sebastian Andrzej Siewior
2025-06-12 0:24 ` brian m. carlson
@ 2025-06-12 20:10 ` Junio C Hamano
1 sibling, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2025-06-12 20:10 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: git
Sebastian Andrzej Siewior <sebastian@breakpoint.cc> writes:
> The __BYTE_ORDER__ define is provided by gcc (since ~v4.6), clang
> (since ~v3.2) and icc (since ~16.0.3). It is not provided by msvc as of
> v19.43 / 17.13.6.
I think it is more confusing than illuninating to have the last
sentence in the above paragraph. There probably are tons of other
compilers that do not support it, and this patch does not target any
of them, including MSVC, which is the topic of the next step anyway.
So, I'm inclined to just remove the sentence, or replace it with
something like "Even recent versions of MSVC do not support it,
which will be dealt with in the next patch."
> The __BYTE_ORDER and BYTE_ORDER macros are libc specific and are not
> available on all supported platforms such as mingw.
>
> Add support for the __BYTE_ORDER__ macro as a fallback.
>
> Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> ---
> compat/bswap.h | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/compat/bswap.h b/compat/bswap.h
> index b34054f2bd728..0a457542dd76a 100644
> --- a/compat/bswap.h
> +++ b/compat/bswap.h
> @@ -116,6 +116,12 @@ static inline uint64_t git_bswap64(uint64_t x)
> # define GIT_LITTLE_ENDIAN LITTLE_ENDIAN
> # define GIT_BIG_ENDIAN BIG_ENDIAN
>
> +#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && defined(__ORDER_BIG_ENDIAN__)
> +
> +# define GIT_BYTE_ORDER __BYTE_ORDER__
> +# define GIT_LITTLE_ENDIAN __ORDER_LITTLE_ENDIAN__
> +# define GIT_BIG_ENDIAN __ORDER_BIG_ENDIAN__
> +
> #else
>
> # define GIT_BIG_ENDIAN 4321
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 3/6] bswap.h: Define GIT_LITTLE_ENDIAN on msvc as little endian
2025-06-11 22:14 ` [PATCH v2 3/6] bswap.h: Define GIT_LITTLE_ENDIAN on msvc as little endian Sebastian Andrzej Siewior
@ 2025-06-12 20:11 ` Junio C Hamano
0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2025-06-12 20:11 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: git
Sebastian Andrzej Siewior <sebastian@breakpoint.cc> writes:
> The Microsoft Visual C++ (MSVC) compiler (as of Visual Studio 2022
> version 17.13.6) does not define __BYTE_ORDER__ and its C-library does
> not define __BYTE_ORDER. The compiler is supported only an arm64 and x86
> which are all little endian.
"an arm" -> "on arm"? Just sanity checking before making local
typofix while queueing.
> Define GIT_BYTE_ORDER on msvc as little endian to avoid further checks.
>
> Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> ---
> compat/bswap.h | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/compat/bswap.h b/compat/bswap.h
> index 0a457542dd76a..fd604d9f7b74b 100644
> --- a/compat/bswap.h
> +++ b/compat/bswap.h
> @@ -81,6 +81,10 @@ static inline uint64_t git_bswap64(uint64_t x)
> #define bswap32(x) _byteswap_ulong(x)
> #define bswap64(x) _byteswap_uint64(x)
>
> +#define GIT_LITTLE_ENDIAN 1234
> +#define GIT_BIG_ENDIAN 4321
> +#define GIT_BYTE_ORDER GIT_LITTLE_ENDIAN
> +
> #endif
>
> #if defined(bswap32)
> @@ -122,7 +126,7 @@ static inline uint64_t git_bswap64(uint64_t x)
> # define GIT_LITTLE_ENDIAN __ORDER_LITTLE_ENDIAN__
> # define GIT_BIG_ENDIAN __ORDER_BIG_ENDIAN__
>
> -#else
> +#elif !defined(GIT_BYTE_ORDER)
>
> # define GIT_BIG_ENDIAN 4321
> # define GIT_LITTLE_ENDIAN 1234
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 4/6] bswap.h: Always overwrite ntohl/ ntohll macros
2025-06-12 0:14 ` brian m. carlson
@ 2025-06-12 20:16 ` Junio C Hamano
0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2025-06-12 20:16 UTC (permalink / raw)
To: brian m. carlson; +Cc: Sebastian Andrzej Siewior, git
"brian m. carlson" <sandals@crustytoothpaste.net> writes:
> On 2025-06-11 at 22:14:40, Sebastian Andrzej Siewior wrote:
>> The ntohl and htonl macros are redefined because the provided macros were
>> not always optimal. Sometimes it was a function call, sometimes it was a
>> macro which did the shifting. Using the 'bswap' opcode on x86 provides
>> probably better performance than performing the shifting.
>
> I believe that the peephole optimizer will almost always optimize them
> to the bswap or equivalent opcode, much like it recognizes how to
> generate rotate opcodes from two shifts and an or, so they should
> actually be equivalent.
>
> GCC and clang both emit simple bswap instructions with `-O2`, which is
> the optimization level we use: https://godbolt.org/z/1r8P1Pqo7.
Good observation. In short, we do not have to redefine these in
terms of bswap32/64 for performance as the compilers should do a
reasonable job.
The updated organization to separate the two concerns in this file,
namely, (1) figure out the best way to write bswap32/64, and (2)
override (or supply on platforms that do not offer) host-network
byte order helpers, does clean things up a lot, so I still like to
see it, though.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 0/6] bswap.h: Rework ntohl handling
2025-06-11 22:14 [PATCH v2 0/6] bswap.h: Rework ntohl handling Sebastian Andrzej Siewior
` (6 preceding siblings ...)
2025-06-11 22:14 ` [PATCH v2 6/6] bswap.h: Provide a built-in based version of bswap32/64 if possible Sebastian Andrzej Siewior
@ 2025-06-12 20:21 ` Junio C Hamano
2025-07-07 22:43 ` Junio C Hamano
8 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2025-06-12 20:21 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: git
Sebastian Andrzej Siewior <sebastian@breakpoint.cc> writes:
> since the add of bswap32/64() support based on __builtin support, the
> usage of ntohl() macros is broken on big endian machines because the
> macros are always overwritten providing a swap.
>
> The first patch in the series reverts the change and the following
> patches try to improve the situation by allowing to always provide an
> optimized version.
I am inclined to take only this one for the upcoming release, as a
pure regression fix. It's not like with the change reverted, those
platforms with __builtin_bswap32/64 will stop building correctly.
The worst is that they fall back to the implementation they have
always been using until any and all released versions of Git.
The rest looked promising and with very good materials, but let's
leave them for the cycle after 2.50 ships.
Thanks.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 5/6] bswap.h: Remove optimized x86 version of bswap32/64
2025-06-11 22:14 ` [PATCH v2 5/6] bswap.h: Remove optimized x86 version of bswap32/64 Sebastian Andrzej Siewior
@ 2025-06-26 15:55 ` Kristoffer Haugsbakk
2025-07-15 19:02 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 20+ messages in thread
From: Kristoffer Haugsbakk @ 2025-06-26 15:55 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, git
On Thu, Jun 12, 2025, at 00:14, Sebastian Andrzej Siewior wrote:
> On x86 the bswap32/64 macro is implemenated based on the x86 opcode wich
s/implemenated/implemented/
s/wich/which/
> performs the required shifting in just one opcode.
> The other CPUs fallback to the generic shifting as implemented by
> default_swab32() and default_bswap64() if needed.
>
> I've been looking at how good a compiler is at recognizing the default
> shift and emitting an optimized operation:
> - x86, arm64 msvc v19.20
> default_swab32() optimized
> default_bswap64() shifts
> _byteswap_uint64() otimized
>
> - x86, arm64 msvc v19.37
> default_swab32() optimized
> default_bswap64() optimized
> _byteswap_uint64() otimized
Is it supposed to say `otimized` twice?
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 0/6] bswap.h: Rework ntohl handling
2025-06-11 22:14 [PATCH v2 0/6] bswap.h: Rework ntohl handling Sebastian Andrzej Siewior
` (7 preceding siblings ...)
2025-06-12 20:21 ` [PATCH v2 0/6] bswap.h: Rework ntohl handling Junio C Hamano
@ 2025-07-07 22:43 ` Junio C Hamano
2025-07-09 6:04 ` Sebastian Andrzej Siewior
8 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2025-07-07 22:43 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: git
Sebastian Andrzej Siewior <sebastian@breakpoint.cc> writes:
> since the add of bswap32/64() support based on __builtin support, the
> usage of ntohl() macros is broken on big endian machines because the
> macros are always overwritten providing a swap.
>
> The first patch in the series reverts the change and the following
> patches try to improve the situation by allowing to always provide an
> optimized version.
>
> I've been looking at recent compiler and they manage to recognize the
> manual shifting and use an optimize opcode if available. The ntohl
> version provided by glibc already provides an "optimized" version which
> makes an optimisation in git almost not needed.
> One of the motivation behind overwriting/ providing an optimized
> version was to provide a macro instead of using a function call. One
> libc that is still providing ntohl as a function call is musl. So it
> might makes sense to keep it.
> While ntohl() is provided by the libc, the ntohll() is not. I found it
> only on Windows provided by winsock.h.
>
> I haven't touched the put/get_be*() macros. gcc & clang are both smart
> enough to swap the content accordingly and perform a single store/ load.
> Only the msvc seems to strugle here and performs multiple bytes stores/
> loads and shifts.
Now, I see many comments and suggestions on the thread since this v2
iteration was posted:
https://lore.kernel.org/git/20250611221444.1567638-1-sebastian@breakpoint.cc/
and I think the ball is in the author's court to respond.
I'll mark the topic as Stalled in the draft of the next issue of
"What's cooking" report.
Thanks.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 0/6] bswap.h: Rework ntohl handling
2025-07-07 22:43 ` Junio C Hamano
@ 2025-07-09 6:04 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 20+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-07-09 6:04 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On 2025-07-07 15:43:20 [-0700], Junio C Hamano wrote:
>
> Now, I see many comments and suggestions on the thread since this v2
> iteration was posted:
>
> https://lore.kernel.org/git/20250611221444.1567638-1-sebastian@breakpoint.cc/
>
> and I think the ball is in the author's court to respond.
>
> I'll mark the topic as Stalled in the draft of the next issue of
> "What's cooking" report.
Sorry for the delay, thank you for the reminder.
> Thanks.
Sebastian
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 5/6] bswap.h: Remove optimized x86 version of bswap32/64
2025-06-26 15:55 ` Kristoffer Haugsbakk
@ 2025-07-15 19:02 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 20+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-07-15 19:02 UTC (permalink / raw)
To: Kristoffer Haugsbakk; +Cc: git
On 2025-06-26 17:55:43 [+0200], Kristoffer Haugsbakk wrote:
> On Thu, Jun 12, 2025, at 00:14, Sebastian Andrzej Siewior wrote:
> > On x86 the bswap32/64 macro is implemenated based on the x86 opcode wich
>
> s/implemenated/implemented/
>
> s/wich/which/
>
> > performs the required shifting in just one opcode.
> > The other CPUs fallback to the generic shifting as implemented by
> > default_swab32() and default_bswap64() if needed.
> >
> > I've been looking at how good a compiler is at recognizing the default
> > shift and emitting an optimized operation:
> > - x86, arm64 msvc v19.20
> > default_swab32() optimized
> > default_bswap64() shifts
> > _byteswap_uint64() otimized
> >
> > - x86, arm64 msvc v19.37
> > default_swab32() optimized
> > default_bswap64() optimized
> > _byteswap_uint64() otimized
>
> Is it supposed to say `otimized` twice?
Nope. Fixed. Thank you.
Sebastian
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2025-07-15 19:02 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-11 22:14 [PATCH v2 0/6] bswap.h: Rework ntohl handling Sebastian Andrzej Siewior
2025-06-11 22:14 ` [PATCH v2 1/6] Revert "bswap.h: add support for built-in bswap functions" Sebastian Andrzej Siewior
2025-06-12 20:04 ` Junio C Hamano
2025-06-11 22:14 ` [PATCH v2 2/6] bswap.h: Add support for __BYTE_ORDER__ Sebastian Andrzej Siewior
2025-06-12 0:24 ` brian m. carlson
2025-06-12 0:43 ` Collin Funk
2025-06-12 20:10 ` Junio C Hamano
2025-06-11 22:14 ` [PATCH v2 3/6] bswap.h: Define GIT_LITTLE_ENDIAN on msvc as little endian Sebastian Andrzej Siewior
2025-06-12 20:11 ` Junio C Hamano
2025-06-11 22:14 ` [PATCH v2 4/6] bswap.h: Always overwrite ntohl/ ntohll macros Sebastian Andrzej Siewior
2025-06-12 0:14 ` brian m. carlson
2025-06-12 20:16 ` Junio C Hamano
2025-06-11 22:14 ` [PATCH v2 5/6] bswap.h: Remove optimized x86 version of bswap32/64 Sebastian Andrzej Siewior
2025-06-26 15:55 ` Kristoffer Haugsbakk
2025-07-15 19:02 ` Sebastian Andrzej Siewior
2025-06-11 22:14 ` [PATCH v2 5/6] bswap: " Sebastian Andrzej Siewior
2025-06-11 22:14 ` [PATCH v2 6/6] bswap.h: Provide a built-in based version of bswap32/64 if possible Sebastian Andrzej Siewior
2025-06-12 20:21 ` [PATCH v2 0/6] bswap.h: Rework ntohl handling Junio C Hamano
2025-07-07 22:43 ` Junio C Hamano
2025-07-09 6:04 ` Sebastian Andrzej Siewior
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).