From: Rostislav Krasny <rostiprodev@gmail.com>
To: git@vger.kernel.org
Cc: Rostislav Krasny <rostiprodev@gmail.com>
Subject: [PATCH 0/1] compat: modernize and simplify byte swapping functions
Date: Fri, 2 Jan 2026 02:27:34 +0200
Message-ID: <20260102002735.31390-1-rostiprodev@gmail.com>
While reading sha256/block/sha256.c I noticed that it uses both the htonl
macro and the get_be32() static inline function, and I was surprised by how
different the implementations of these two closely related things are. With
GCC or Clang, the htonl macro expands to a __builtin_bswap32() call, which on
x86 compiles to a single CPU instruction. The original implementation of
get_be32(), by contrast, uses eight bitwise operations. Even if the compiler
can optimize that code, it is still less readable and more error-prone.
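For illustration only, the contrast described above can be sketched roughly
as follows. This is not the exact code from compat/bswap.h: the names
sketch_get_be32_shifts() and sketch_bswap32() are hypothetical, and
__builtin_bswap32() is assumed to be available (GCC or Clang).

```c
#include <stdint.h>

/*
 * Hypothetical name. Byte-wise big-endian load: each byte is widened,
 * shifted into place and OR-ed into the result. Works for any alignment
 * and any host byte order, at the cost of several bitwise operations.
 */
static inline uint32_t sketch_get_be32_shifts(const void *ptr)
{
	const unsigned char *p = ptr;

	return (uint32_t)p[0] << 24 |
	       (uint32_t)p[1] << 16 |
	       (uint32_t)p[2] <<  8 |
	       (uint32_t)p[3];
}

/*
 * Hypothetical name. Byte swap of an already loaded 32-bit value using
 * the compiler builtin (assumption: GCC or Clang).
 */
static inline uint32_t sketch_bswap32(uint32_t x)
{
	return __builtin_bswap32(x);
}
```

Given the bytes 0x12 0x34 0x56 0x78 in memory, the first function returns
0x12345678 on any host, while the second only reverses the bytes of a value
that has already been loaded.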
The main reason for that complicated implementation is that accessing an
object through a pointer converted from a pointer to a different object type
is undefined behavior (UB). memcpy, on the other hand, is not subject to that
UB, which allows us to make the code simpler and, in some cases, even faster.
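A minimal sketch of the memcpy idea, for illustration only and not the code
from the patch itself. The name sketch_get_be32_memcpy() is hypothetical; the
sketch assumes GCC or Clang (__builtin_bswap32() and the __BYTE_ORDER__
predefined macro) and treats the host as little-endian when that macro does
not say otherwise.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical name. Load a big-endian 32-bit value from a possibly
 * unaligned address. memcpy() copies the bytes into a properly typed
 * object, so no pointer to uint32_t is ever dereferenced on the buffer.
 */
static inline uint32_t sketch_get_be32_memcpy(const void *ptr)
{
	uint32_t v;

	memcpy(&v, ptr, sizeof(v));
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return v;                    /* host order already matches */
#else
	return __builtin_bswap32(v); /* assumption: little-endian host */
#endif
}
```

Calling it on an odd address, for example on buf + 1 of an unsigned char
array, is fine because the copy is done byte-wise by memcpy().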
Additionally, I made a few more small improvements related to the same
functionality.
I've measured the performance of the original and the new code on my Intel
Xeon W-2135 based computer running Fedora 43 Linux with:
* glibc 2.42-5.fc43
* gcc 15.2.1-5.fc43
* clang 21.1.7-1.fc43
I used the following code for these measurements:
#include <inttypes.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include "bswap.h"

#define ITERATIONS 1000000
#define BUF_SIZE 8192

int main(void) {
    uint8_t buffer[BUF_SIZE];
    uint64_t sum = 0;

    for (int i = 0; i < BUF_SIZE; i++) {
        buffer[i] = (uint8_t)i;
    }

    clock_t start = clock();
    for (int i = 0; i < ITERATIONS; i++) {
        // use a volatile pointer to force the compiler to read memory
        volatile uint8_t *p = buffer;
        for (int j = 0; j < BUF_SIZE - 8; j++) {
            sum += get_be64((const void*)(p + j));
        }
    }
    clock_t end = clock();

    double time_taken = (double)(end - start) / CLOCKS_PER_SEC;
    printf("Time taken: %f seconds\n", time_taken);
    printf("Checksum: %" PRIu64 "\n", sum);
    return 0;
}
And these are the results (time in seconds, three runs per configuration):
GCC 15.2.1
version | -Os | -O0 | -O1 | -O2 | -O3
================================================================
| 3.721806 |72.342204 |11.956021 | 3.119833 | 0.919873
original| 3.726111 |72.326920 |11.963618 | 3.128222 | 0.921128
| 3.719791 |72.328175 |11.949108 | 3.130956 | 0.920296
================================================================
| 3.719899 |17.177719 | 3.005065 | 3.120747 | 0.920609
new | 3.714785 |17.168950 | 3.004978 | 3.119227 | 0.918851
| 3.716782 |17.145386 | 3.009364 | 3.119573 | 0.920030
================================================================
Clang 21.1.7
version | -Os | -O0 | -O1 | -O2 | -O3
================================================================
| 3.690718 |62.916338 | 3.017460 | 3.768443 | 3.778840
original| 3.686283 |62.965916 | 3.014674 | 3.777897 | 3.774776
| 3.687775 |62.850648 | 3.003496 | 3.766108 | 3.765313
================================================================
| 3.681818 |16.753385 | 3.008131 | 2.075271 | 2.076090
new | 3.687184 |16.737982 | 3.004365 | 2.071597 | 2.074507
| 3.683960 |16.765067 | 2.999775 | 2.075354 | 2.075759
================================================================
Rostislav Krasny (1):
compat: modernize and simplify byte swapping functions
compat/bswap.h | 74 ++++++++++++++++++++++++++++++--------------------
1 file changed, 44 insertions(+), 30 deletions(-)
--
2.52.0