* [PATCH 1/6] genksyms: fix memory leak when the same symbol is added from source
@ 2025-01-03 7:30 Masahiro Yamada
2025-01-03 7:30 ` [PATCH 2/6] genksyms: fix memory leak when the same symbol is read from *.symref file Masahiro Yamada
` (5 more replies)
0 siblings, 6 replies; 10+ messages in thread
From: Masahiro Yamada @ 2025-01-03 7:30 UTC (permalink / raw)
To: linux-kbuild
Cc: linux-kernel, Masahiro Yamada, Andreas Gruenbacher, Andrew Morton,
Sam Ravnborg
When a symbol that is already registered is added again, __add_symbol()
returns without freeing the symbol definition, making it unreachable.
The following test cases demonstrate different memory leak points.
[Test Case 1]
Forward declaration with exactly the same definition
$ cat foo.c
#include <linux/export.h>
void foo(void);
void foo(void) {}
EXPORT_SYMBOL(foo);
[Test Case 2]
Forward declaration with a different definition (e.g. attribute)
$ cat foo.c
#include <linux/export.h>
void foo(void);
__attribute__((__section__(".ref.text"))) void foo(void) {}
EXPORT_SYMBOL(foo);
[Test Case 3]
Preserving an overridden symbol (compile with KBUILD_PRESERVE=1)
$ cat foo.c
#include <linux/export.h>
void foo(void);
void foo(void) { }
EXPORT_SYMBOL(foo);
$ cat foo.symref
override foo void foo ( int )
The memory leaks in Test Case 1 and 2 have existed since the introduction
of genksyms into the kernel tree. [1]
The memory leak in Test Case 3 was introduced by commit 5dae9a550a74
("genksyms: allow to ignore symbol checksum changes").
When multiple init_declarators are reduced to an init_declarator_list,
the decl_spec must be duplicated. Otherwise, the following Test Case 4
would result in a double-free bug.
[Test Case 4]
$ cat foo.c
#include <linux/export.h>
extern int foo, bar;
int foo, bar;
EXPORT_SYMBOL(foo);
In this case, 'foo' and 'bar' share the same decl_spec, 'int'. It must
be unshared before being passed to add_symbol().
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=46bd1da672d66ccd8a639d3c1f8a166048cca608
Fixes: 5dae9a550a74 ("genksyms: allow to ignore symbol checksum changes")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/genksyms.c | 3 +++
scripts/genksyms/parse.y | 14 ++++++++++++--
2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/scripts/genksyms/genksyms.c b/scripts/genksyms/genksyms.c
index 07f9b8cfb233..8ca46f807b57 100644
--- a/scripts/genksyms/genksyms.c
+++ b/scripts/genksyms/genksyms.c
@@ -239,6 +239,7 @@ static struct symbol *__add_symbol(const char *name, enum symbol_type type,
"unchanged\n");
}
sym->is_declared = 1;
+ free_list(defn, NULL);
return sym;
} else if (!sym->is_declared) {
if (sym->is_override && flag_preserve) {
@@ -247,6 +248,7 @@ static struct symbol *__add_symbol(const char *name, enum symbol_type type,
print_type_name(type, name);
fprintf(stderr, " modversion change\n");
sym->is_declared = 1;
+ free_list(defn, NULL);
return sym;
} else {
status = is_unknown_symbol(sym) ?
@@ -254,6 +256,7 @@ static struct symbol *__add_symbol(const char *name, enum symbol_type type,
}
} else {
error_with_pos("redefinition of %s", name);
+ free_list(defn, NULL);
return sym;
}
break;
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index 8e9b5e69e8f0..840371d01bf4 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -152,14 +152,19 @@ simple_declaration:
;
init_declarator_list_opt:
- /* empty */ { $$ = NULL; }
- | init_declarator_list
+ /* empty */ { $$ = NULL; }
+ | init_declarator_list { free_list(decl_spec, NULL); $$ = $1; }
;
init_declarator_list:
init_declarator
{ struct string_list *decl = *$1;
*$1 = NULL;
+
+ /* avoid sharing among multiple init_declarators */
+ if (decl_spec)
+ decl_spec = copy_list_range(decl_spec, NULL);
+
add_symbol(current_name,
is_typedef ? SYM_TYPEDEF : SYM_NORMAL, decl, is_extern);
current_name = NULL;
@@ -170,6 +175,11 @@ init_declarator_list:
*$3 = NULL;
free_list(*$2, NULL);
*$2 = decl_spec;
+
+ /* avoid sharing among multiple init_declarators */
+ if (decl_spec)
+ decl_spec = copy_list_range(decl_spec, NULL);
+
add_symbol(current_name,
is_typedef ? SYM_TYPEDEF : SYM_NORMAL, decl, is_extern);
current_name = NULL;
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/6] genksyms: fix memory leak when the same symbol is read from *.symref file
2025-01-03 7:30 [PATCH 1/6] genksyms: fix memory leak when the same symbol is added from source Masahiro Yamada
@ 2025-01-03 7:30 ` Masahiro Yamada
2025-01-03 12:50 ` Markus Elfring
2025-01-03 7:30 ` [PATCH 3/6] genksyms: reduce the indentation in the for-loop in __add_symbol() Masahiro Yamada
` (4 subsequent siblings)
5 siblings, 1 reply; 10+ messages in thread
From: Masahiro Yamada @ 2025-01-03 7:30 UTC (permalink / raw)
To: linux-kbuild
Cc: linux-kernel, Masahiro Yamada, Andreas Gruenbacher, Andrew Morton,
Sam Ravnborg
When a symbol that is already registered is read again from *.symref
file, __add_symbol() removes the previous one from the hash table without
freeing it.
[Test Case]
$ cat foo.c
#include <linux/export.h>
void foo(void);
void foo(void) {}
EXPORT_SYMBOL(foo);
$ cat foo.symref
foo void foo ( void )
foo void foo ( void )
When a symbol is removed from the hash table, it must be freed along
with its ->name and ->defn members. However, sym->name cannot be freed
because it is sometimes shared with node->string, but not always. If
sym->name and node->string share the same memory, free(sym->name) could
lead to a double-free bug.
To resolve this issue, always assign a strdup'ed string to sym->name.
Fixes: 64e6c1e12372 ("genksyms: track symbol checksum changes")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/genksyms.c | 8 ++++++--
scripts/genksyms/genksyms.h | 2 +-
scripts/genksyms/parse.y | 4 ++--
3 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/scripts/genksyms/genksyms.c b/scripts/genksyms/genksyms.c
index 8ca46f807b57..c5e8e0e0f949 100644
--- a/scripts/genksyms/genksyms.c
+++ b/scripts/genksyms/genksyms.c
@@ -272,11 +272,15 @@ static struct symbol *__add_symbol(const char *name, enum symbol_type type,
break;
}
}
+
+ free_list(sym->defn, NULL);
+ free(sym->name);
+ free(sym);
--nsyms;
}
sym = xmalloc(sizeof(*sym));
- sym->name = name;
+ sym->name = xstrdup(name);
sym->type = type;
sym->defn = defn;
sym->expansion_trail = NULL;
@@ -483,7 +487,7 @@ static void read_reference(FILE *f)
defn = def;
def = read_node(f);
}
- subsym = add_reference_symbol(xstrdup(sym->string), sym->tag,
+ subsym = add_reference_symbol(sym->string, sym->tag,
defn, is_extern);
subsym->is_override = is_override;
free_node(sym);
diff --git a/scripts/genksyms/genksyms.h b/scripts/genksyms/genksyms.h
index 21ed2ec2d98c..5621533dcb8e 100644
--- a/scripts/genksyms/genksyms.h
+++ b/scripts/genksyms/genksyms.h
@@ -32,7 +32,7 @@ struct string_list {
struct symbol {
struct symbol *hash_next;
- const char *name;
+ char *name;
enum symbol_type type;
struct string_list *defn;
struct symbol *expansion_trail;
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index 840371d01bf4..689cb6bb40b6 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -482,12 +482,12 @@ enumerator_list:
enumerator:
IDENT
{
- const char *name = strdup((*$1)->string);
+ const char *name = (*$1)->string;
add_symbol(name, SYM_ENUM_CONST, NULL, 0);
}
| IDENT '=' EXPRESSION_PHRASE
{
- const char *name = strdup((*$1)->string);
+ const char *name = (*$1)->string;
struct string_list *expr = copy_list_range(*$3, *$2);
add_symbol(name, SYM_ENUM_CONST, expr, 0);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 3/6] genksyms: reduce the indentation in the for-loop in __add_symbol()
2025-01-03 7:30 [PATCH 1/6] genksyms: fix memory leak when the same symbol is added from source Masahiro Yamada
2025-01-03 7:30 ` [PATCH 2/6] genksyms: fix memory leak when the same symbol is read from *.symref file Masahiro Yamada
@ 2025-01-03 7:30 ` Masahiro Yamada
2025-01-03 7:30 ` [PATCH 4/6] genksyms: refactor the return points " Masahiro Yamada
` (3 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Masahiro Yamada @ 2025-01-03 7:30 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
To improve readability, reduce the indentation as follows:
- Use 'continue' earlier when the symbol does not match
- flip !sym->is_declared to flatten the if-else chain
No functional changes are intended.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/genksyms.c | 63 ++++++++++++++++++-------------------
1 file changed, 30 insertions(+), 33 deletions(-)
diff --git a/scripts/genksyms/genksyms.c b/scripts/genksyms/genksyms.c
index c5e8e0e0f949..5a90acd693f4 100644
--- a/scripts/genksyms/genksyms.c
+++ b/scripts/genksyms/genksyms.c
@@ -226,41 +226,38 @@ static struct symbol *__add_symbol(const char *name, enum symbol_type type,
h = crc32(name) % HASH_BUCKETS;
for (sym = symtab[h]; sym; sym = sym->hash_next) {
- if (map_to_ns(sym->type) == map_to_ns(type) &&
- strcmp(name, sym->name) == 0) {
- if (is_reference)
- /* fall through */ ;
- else if (sym->type == type &&
- equal_list(sym->defn, defn)) {
- if (!sym->is_declared && sym->is_override) {
- print_location();
- print_type_name(type, name);
- fprintf(stderr, " modversion is "
- "unchanged\n");
- }
- sym->is_declared = 1;
- free_list(defn, NULL);
- return sym;
- } else if (!sym->is_declared) {
- if (sym->is_override && flag_preserve) {
- print_location();
- fprintf(stderr, "ignoring ");
- print_type_name(type, name);
- fprintf(stderr, " modversion change\n");
- sym->is_declared = 1;
- free_list(defn, NULL);
- return sym;
- } else {
- status = is_unknown_symbol(sym) ?
- STATUS_DEFINED : STATUS_MODIFIED;
- }
- } else {
- error_with_pos("redefinition of %s", name);
- free_list(defn, NULL);
- return sym;
+ if (map_to_ns(sym->type) != map_to_ns(type) ||
+ strcmp(name, sym->name))
+ continue;
+
+ if (is_reference) {
+ /* fall through */ ;
+ } else if (sym->type == type && equal_list(sym->defn, defn)) {
+ if (!sym->is_declared && sym->is_override) {
+ print_location();
+ print_type_name(type, name);
+ fprintf(stderr, " modversion is unchanged\n");
}
- break;
+ sym->is_declared = 1;
+ free_list(defn, NULL);
+ return sym;
+ } else if (sym->is_declared) {
+ error_with_pos("redefinition of %s", name);
+ free_list(defn, NULL);
+ return sym;
+ } else if (sym->is_override && flag_preserve) {
+ print_location();
+ fprintf(stderr, "ignoring ");
+ print_type_name(type, name);
+ fprintf(stderr, " modversion change\n");
+ sym->is_declared = 1;
+ free_list(defn, NULL);
+ return sym;
+ } else {
+ status = is_unknown_symbol(sym) ?
+ STATUS_DEFINED : STATUS_MODIFIED;
}
+ break;
}
if (sym) {
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 4/6] genksyms: refactor the return points in the for-loop in __add_symbol()
2025-01-03 7:30 [PATCH 1/6] genksyms: fix memory leak when the same symbol is added from source Masahiro Yamada
2025-01-03 7:30 ` [PATCH 2/6] genksyms: fix memory leak when the same symbol is read from *.symref file Masahiro Yamada
2025-01-03 7:30 ` [PATCH 3/6] genksyms: reduce the indentation in the for-loop in __add_symbol() Masahiro Yamada
@ 2025-01-03 7:30 ` Masahiro Yamada
2025-01-03 7:30 ` [PATCH 5/6] genksyms: use generic macros for hash table implementation Masahiro Yamada
` (2 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Masahiro Yamada @ 2025-01-03 7:30 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
free_list() must be called before returning from this for-loop.
Swap 'break' and the combination of free_list() and 'return'.
This reduces the code and minimizes the risk of introducing memory
leaks in future changes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/genksyms.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/scripts/genksyms/genksyms.c b/scripts/genksyms/genksyms.c
index 5a90acd693f4..41d6cfce0088 100644
--- a/scripts/genksyms/genksyms.c
+++ b/scripts/genksyms/genksyms.c
@@ -231,7 +231,7 @@ static struct symbol *__add_symbol(const char *name, enum symbol_type type,
continue;
if (is_reference) {
- /* fall through */ ;
+ break;
} else if (sym->type == type && equal_list(sym->defn, defn)) {
if (!sym->is_declared && sym->is_override) {
print_location();
@@ -239,25 +239,21 @@ static struct symbol *__add_symbol(const char *name, enum symbol_type type,
fprintf(stderr, " modversion is unchanged\n");
}
sym->is_declared = 1;
- free_list(defn, NULL);
- return sym;
} else if (sym->is_declared) {
error_with_pos("redefinition of %s", name);
- free_list(defn, NULL);
- return sym;
} else if (sym->is_override && flag_preserve) {
print_location();
fprintf(stderr, "ignoring ");
print_type_name(type, name);
fprintf(stderr, " modversion change\n");
sym->is_declared = 1;
- free_list(defn, NULL);
- return sym;
} else {
status = is_unknown_symbol(sym) ?
STATUS_DEFINED : STATUS_MODIFIED;
+ break;
}
- break;
+ free_list(defn, NULL);
+ return sym;
}
if (sym) {
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 5/6] genksyms: use generic macros for hash table implementation
2025-01-03 7:30 [PATCH 1/6] genksyms: fix memory leak when the same symbol is added from source Masahiro Yamada
` (2 preceding siblings ...)
2025-01-03 7:30 ` [PATCH 4/6] genksyms: refactor the return points " Masahiro Yamada
@ 2025-01-03 7:30 ` Masahiro Yamada
2025-01-03 7:30 ` [PATCH 6/6] genksyms: use uint32_t instead of unsigned long for calculating CRC Masahiro Yamada
2025-01-03 11:56 ` [PATCH 1/6] genksyms: fix memory leak when the same symbol is added from source Markus Elfring
5 siblings, 0 replies; 10+ messages in thread
From: Masahiro Yamada @ 2025-01-03 7:30 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
Use macros provided by hashtable.h
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/genksyms.c | 32 ++++++++++++--------------------
scripts/genksyms/genksyms.h | 4 +++-
2 files changed, 15 insertions(+), 21 deletions(-)
diff --git a/scripts/genksyms/genksyms.c b/scripts/genksyms/genksyms.c
index 41d6cfce0088..e2cd3dcb469f 100644
--- a/scripts/genksyms/genksyms.c
+++ b/scripts/genksyms/genksyms.c
@@ -18,12 +18,12 @@
#include <stdarg.h>
#include <getopt.h>
+#include <hashtable.h>
+
#include "genksyms.h"
/*----------------------------------------------------------------------*/
-#define HASH_BUCKETS 4096
-
-static struct symbol *symtab[HASH_BUCKETS];
+static HASHTABLE_DEFINE(symbol_hashtable, 1U << 12);
static FILE *debugfile;
int cur_line = 1;
@@ -151,14 +151,14 @@ static enum symbol_type map_to_ns(enum symbol_type t)
struct symbol *find_symbol(const char *name, enum symbol_type ns, int exact)
{
- unsigned long h = crc32(name) % HASH_BUCKETS;
struct symbol *sym;
- for (sym = symtab[h]; sym; sym = sym->hash_next)
+ hash_for_each_possible(symbol_hashtable, sym, hnode, crc32(name)) {
if (map_to_ns(sym->type) == map_to_ns(ns) &&
strcmp(name, sym->name) == 0 &&
sym->is_declared)
break;
+ }
if (exact && sym && sym->type != ns)
return NULL;
@@ -224,8 +224,8 @@ static struct symbol *__add_symbol(const char *name, enum symbol_type type,
return NULL;
}
- h = crc32(name) % HASH_BUCKETS;
- for (sym = symtab[h]; sym; sym = sym->hash_next) {
+ h = crc32(name);
+ hash_for_each_possible(symbol_hashtable, sym, hnode, h) {
if (map_to_ns(sym->type) != map_to_ns(type) ||
strcmp(name, sym->name))
continue;
@@ -257,14 +257,7 @@ static struct symbol *__add_symbol(const char *name, enum symbol_type type,
}
if (sym) {
- struct symbol **psym;
-
- for (psym = &symtab[h]; *psym; psym = &(*psym)->hash_next) {
- if (*psym == sym) {
- *psym = sym->hash_next;
- break;
- }
- }
+ hash_del(&sym->hnode);
free_list(sym->defn, NULL);
free(sym->name);
@@ -280,8 +273,7 @@ static struct symbol *__add_symbol(const char *name, enum symbol_type type,
sym->visited = NULL;
sym->is_extern = is_extern;
- sym->hash_next = symtab[h];
- symtab[h] = sym;
+ hash_add(symbol_hashtable, &sym->hnode, h);
sym->is_declared = !is_reference;
sym->status = status;
@@ -832,9 +824,9 @@ int main(int argc, char **argv)
}
if (flag_debug) {
- fprintf(debugfile, "Hash table occupancy %d/%d = %g\n",
- nsyms, HASH_BUCKETS,
- (double)nsyms / (double)HASH_BUCKETS);
+ fprintf(debugfile, "Hash table occupancy %d/%zd = %g\n",
+ nsyms, HASH_SIZE(symbol_hashtable),
+ (double)nsyms / HASH_SIZE(symbol_hashtable));
}
if (dumpfile)
diff --git a/scripts/genksyms/genksyms.h b/scripts/genksyms/genksyms.h
index 5621533dcb8e..8c45ada59ece 100644
--- a/scripts/genksyms/genksyms.h
+++ b/scripts/genksyms/genksyms.h
@@ -14,6 +14,8 @@
#include <stdio.h>
+#include <list_types.h>
+
enum symbol_type {
SYM_NORMAL, SYM_TYPEDEF, SYM_ENUM, SYM_STRUCT, SYM_UNION,
SYM_ENUM_CONST
@@ -31,7 +33,7 @@ struct string_list {
};
struct symbol {
- struct symbol *hash_next;
+ struct hlist_node hnode;
char *name;
enum symbol_type type;
struct string_list *defn;
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 6/6] genksyms: use uint32_t instead of unsigned long for calculating CRC
2025-01-03 7:30 [PATCH 1/6] genksyms: fix memory leak when the same symbol is added from source Masahiro Yamada
` (3 preceding siblings ...)
2025-01-03 7:30 ` [PATCH 5/6] genksyms: use generic macros for hash table implementation Masahiro Yamada
@ 2025-01-03 7:30 ` Masahiro Yamada
2025-01-03 8:34 ` David Laight
2025-01-03 11:56 ` [PATCH 1/6] genksyms: fix memory leak when the same symbol is added from source Markus Elfring
5 siblings, 1 reply; 10+ messages in thread
From: Masahiro Yamada @ 2025-01-03 7:30 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
Currently, 'unsigned long' is used for intermediate variables when
calculating CRCs.
The size of 'long' differs depending on the architecture: it is 32 bits
on 32-bit architectures and 64 bits on 64-bit architectures.
The CRC values generated by genksyms represent the compatibility of
exported symbols. Therefore, reproducibility is important. In other
words, we need to ensure that the output is the same when the kernel
source is identical, regardless of whether genksyms is running on a
32-bit or 64-bit build machine.
Fortunately, the output from genksyms is not affected by the build
machine's architecture because only the lower 32 bits of the
'unsigned long' variables are used.
To make it even clearer that the CRC calculation is independent of
the build machine's architecture, this commit explicitly uses the
fixed-width type, uint32_t.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/genksyms.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/scripts/genksyms/genksyms.c b/scripts/genksyms/genksyms.c
index e2cd3dcb469f..8b0d7ac73dbb 100644
--- a/scripts/genksyms/genksyms.c
+++ b/scripts/genksyms/genksyms.c
@@ -12,6 +12,7 @@
#include <stdio.h>
#include <string.h>
+#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <assert.h>
@@ -60,7 +61,7 @@ static void print_type_name(enum symbol_type type, const char *name);
/*----------------------------------------------------------------------*/
-static const unsigned int crctab32[] = {
+static const uint32_t crctab32[] = {
0x00000000U, 0x77073096U, 0xee0e612cU, 0x990951baU, 0x076dc419U,
0x706af48fU, 0xe963a535U, 0x9e6495a3U, 0x0edb8832U, 0x79dcb8a4U,
0xe0d5e91eU, 0x97d2d988U, 0x09b64c2bU, 0x7eb17cbdU, 0xe7b82d07U,
@@ -115,19 +116,19 @@ static const unsigned int crctab32[] = {
0x2d02ef8dU
};
-static unsigned long partial_crc32_one(unsigned char c, unsigned long crc)
+static uint32_t partial_crc32_one(uint8_t c, uint32_t crc)
{
return crctab32[(crc ^ c) & 0xff] ^ (crc >> 8);
}
-static unsigned long partial_crc32(const char *s, unsigned long crc)
+static uint32_t partial_crc32(const char *s, uint32_t crc)
{
while (*s)
crc = partial_crc32_one(*s++, crc);
return crc;
}
-static unsigned long crc32(const char *s)
+static uint32_t crc32(const char *s)
{
return partial_crc32(s, 0xffffffff) ^ 0xffffffff;
}
@@ -517,7 +518,7 @@ static void print_list(FILE * f, struct string_list *list)
}
}
-static unsigned long expand_and_crc_sym(struct symbol *sym, unsigned long crc)
+static uint32_t expand_and_crc_sym(struct symbol *sym, uint32_t crc)
{
struct string_list *list = sym->defn;
struct string_list **e, **b;
@@ -624,7 +625,7 @@ static unsigned long expand_and_crc_sym(struct symbol *sym, unsigned long crc)
void export_symbol(const char *name)
{
struct symbol *sym;
- unsigned long crc;
+ uint32_t crc;
int has_changed = 0;
sym = find_symbol(name, SYM_NORMAL, 0);
@@ -672,7 +673,7 @@ void export_symbol(const char *name)
if (flag_dump_defs)
fputs(">\n", debugfile);
- printf("#SYMVER %s 0x%08lx\n", name, crc);
+ printf("#SYMVER %s 0x%08lx\n", name, (unsigned long)crc);
}
/*----------------------------------------------------------------------*/
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 6/6] genksyms: use uint32_t instead of unsigned long for calculating CRC
2025-01-03 7:30 ` [PATCH 6/6] genksyms: use uint32_t instead of unsigned long for calculating CRC Masahiro Yamada
@ 2025-01-03 8:34 ` David Laight
2025-01-10 17:34 ` Masahiro Yamada
0 siblings, 1 reply; 10+ messages in thread
From: David Laight @ 2025-01-03 8:34 UTC (permalink / raw)
To: Masahiro Yamada; +Cc: linux-kbuild, linux-kernel
On Fri, 3 Jan 2025 16:30:43 +0900
Masahiro Yamada <masahiroy@kernel.org> wrote:
> Currently, 'unsigned long' is used for intermediate variables when
> calculating CRCs.
>
> The size of 'long' differs depending on the architecture: it is 32 bits
> on 32-bit architectures and 64 bits on 64-bit architectures.
>
> The CRC values generated by genksyms represent the compatibility of
> exported symbols. Therefore, reproducibility is important. In other
> words, we need to ensure that the output is the same when the kernel
> source is identical, regardless of whether genksyms is running on a
> 32-bit or 64-bit build machine.
>
> Fortunately, the output from genksyms is not affected by the build
> machine's architecture because only the lower 32 bits of the
> 'unsigned long' variables are used.
>
> To make it even clearer that the CRC calculation is independent of
> the build machine's architecture, this commit explicitly uses the
> fixed-width type, uint32_t.
>
> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
> ---
>
> scripts/genksyms/genksyms.c | 15 ++++++++-------
> 1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/scripts/genksyms/genksyms.c b/scripts/genksyms/genksyms.c
> index e2cd3dcb469f..8b0d7ac73dbb 100644
> --- a/scripts/genksyms/genksyms.c
> +++ b/scripts/genksyms/genksyms.c
>...
> - printf("#SYMVER %s 0x%08lx\n", name, crc);
> + printf("#SYMVER %s 0x%08lx\n", name, (unsigned long)crc);
That should use PRIu32, but the whole patch could just use 'unsigned int'.
No one is going to try to build this where 'int' is 16bit.
All the hex constants assume that int is 32bits as well.
David
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/6] genksyms: fix memory leak when the same symbol is added from source
2025-01-03 7:30 [PATCH 1/6] genksyms: fix memory leak when the same symbol is added from source Masahiro Yamada
` (4 preceding siblings ...)
2025-01-03 7:30 ` [PATCH 6/6] genksyms: use uint32_t instead of unsigned long for calculating CRC Masahiro Yamada
@ 2025-01-03 11:56 ` Markus Elfring
5 siblings, 0 replies; 10+ messages in thread
From: Markus Elfring @ 2025-01-03 11:56 UTC (permalink / raw)
To: Masahiro Yamada, linux-kbuild
Cc: LKML, Andreas Grünbacher, Andrew Morton, Sam Ravnborg
…
> +++ b/scripts/genksyms/genksyms.c
> @@ -239,6 +239,7 @@ static struct symbol *__add_symbol(const char *name, enum symbol_type type,
> "unchanged\n");
> }
> sym->is_declared = 1;
> + free_list(defn, NULL);
> return sym;
> } else if (!sym->is_declared) {
> if (sym->is_override && flag_preserve) {
…
Would you like to complete the data processing by using a corresponding goto chain?
Regards,
Markus
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/6] genksyms: fix memory leak when the same symbol is read from *.symref file
2025-01-03 7:30 ` [PATCH 2/6] genksyms: fix memory leak when the same symbol is read from *.symref file Masahiro Yamada
@ 2025-01-03 12:50 ` Markus Elfring
0 siblings, 0 replies; 10+ messages in thread
From: Markus Elfring @ 2025-01-03 12:50 UTC (permalink / raw)
To: Masahiro Yamada, linux-kbuild
Cc: LKML, Andreas Grünbacher, Andrew Morton, Sam Ravnborg
…
> +++ b/scripts/genksyms/parse.y
> @@ -482,12 +482,12 @@ enumerator_list:
> enumerator:
> IDENT
> {
> - const char *name = strdup((*$1)->string);
> + const char *name = (*$1)->string;
> add_symbol(name, SYM_ENUM_CONST, NULL, 0);
> }
…
I would find it safer and cleaner to separate such adjustments into another update step.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.13-rc5#n81
Will code transformation concerns be reconsidered any more?
Regards,
Markus
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 6/6] genksyms: use uint32_t instead of unsigned long for calculating CRC
2025-01-03 8:34 ` David Laight
@ 2025-01-10 17:34 ` Masahiro Yamada
0 siblings, 0 replies; 10+ messages in thread
From: Masahiro Yamada @ 2025-01-10 17:34 UTC (permalink / raw)
To: David Laight; +Cc: linux-kbuild, linux-kernel
On Fri, Jan 3, 2025 at 5:34 PM David Laight
<david.laight.linux@gmail.com> wrote:
>
> On Fri, 3 Jan 2025 16:30:43 +0900
> Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> > Currently, 'unsigned long' is used for intermediate variables when
> > calculating CRCs.
> >
> > The size of 'long' differs depending on the architecture: it is 32 bits
> > on 32-bit architectures and 64 bits on 64-bit architectures.
> >
> > The CRC values generated by genksyms represent the compatibility of
> > exported symbols. Therefore, reproducibility is important. In other
> > words, we need to ensure that the output is the same when the kernel
> > source is identical, regardless of whether genksyms is running on a
> > 32-bit or 64-bit build machine.
> >
> > Fortunately, the output from genksyms is not affected by the build
> > machine's architecture because only the lower 32 bits of the
> > 'unsigned long' variables are used.
> >
> > To make it even clearer that the CRC calculation is independent of
> > the build machine's architecture, this commit explicitly uses the
> > fixed-width type, uint32_t.
> >
> > Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
> > ---
> >
> > scripts/genksyms/genksyms.c | 15 ++++++++-------
> > 1 file changed, 8 insertions(+), 7 deletions(-)
> >
> > diff --git a/scripts/genksyms/genksyms.c b/scripts/genksyms/genksyms.c
> > index e2cd3dcb469f..8b0d7ac73dbb 100644
> > --- a/scripts/genksyms/genksyms.c
> > +++ b/scripts/genksyms/genksyms.c
> >...
> > - printf("#SYMVER %s 0x%08lx\n", name, crc);
> > + printf("#SYMVER %s 0x%08lx\n", name, (unsigned long)crc);
>
> That should use PRIu32, but the whole patch could just use 'unsigned int'.
> No one is going to try to build this where 'int' is 16bit.
> All the hex constants assume that int is 32bits as well.
The point is, uint32_t is the clearest way to ensure
the variables are fixed width.
Casting to (unsigned long) vs PRIu32 is
just a matter of preference.
--
Best Regards
Masahiro Yamada
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-01-10 17:35 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-03 7:30 [PATCH 1/6] genksyms: fix memory leak when the same symbol is added from source Masahiro Yamada
2025-01-03 7:30 ` [PATCH 2/6] genksyms: fix memory leak when the same symbol is read from *.symref file Masahiro Yamada
2025-01-03 12:50 ` Markus Elfring
2025-01-03 7:30 ` [PATCH 3/6] genksyms: reduce the indentation in the for-loop in __add_symbol() Masahiro Yamada
2025-01-03 7:30 ` [PATCH 4/6] genksyms: refactor the return points " Masahiro Yamada
2025-01-03 7:30 ` [PATCH 5/6] genksyms: use generic macros for hash table implementation Masahiro Yamada
2025-01-03 7:30 ` [PATCH 6/6] genksyms: use uint32_t instead of unsigned long for calculating CRC Masahiro Yamada
2025-01-03 8:34 ` David Laight
2025-01-10 17:34 ` Masahiro Yamada
2025-01-03 11:56 ` [PATCH 1/6] genksyms: fix memory leak when the same symbol is added from source Markus Elfring
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox