* [PATCH] sparse: add support for __VA_OPT__
[not found] <cover.1771930766.git.dan.carpenter@linaro.org>
@ 2026-02-24 11:07 ` Dan Carpenter
2026-02-24 11:16 ` Ben Dooks
2026-02-25 2:39 ` Chris Li
0 siblings, 2 replies; 42+ messages in thread
From: Dan Carpenter @ 2026-02-24 11:07 UTC
To: linux-sparse, Chris Li
Cc: Linus Torvalds, Ricardo Ribalda, Hans Verkuil, Ben Dooks, Al Viro,
Richard Fitzgerald
The Linux kernel has started using __VA_OPT__, so let's add support for it.
__VA_OPT__(x) expands to x, normally a comma, if the __VA_ARGS__
parameter is not empty, and to nothing otherwise. So if you have at least
one argument but possibly more, then you could create a macro like:
#define test_args(a, ...) printf(a __VA_OPT__(,) __VA_ARGS__)
If you call test_args("foo\n") it expands to:
printf("foo\n");
but if you pass two arguments, test_args("foo %d\n", 2), it expands to:
printf("foo %d\n" , 2);
The contents don't have to be a comma: you could have an empty
__VA_OPT__() or you could pass random things like __VA_OPT__(a b c). I
don't know why you would do that. In this last case, "a" is a macro
argument so it has to be expanded out.
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
---
ident-list.h | 1 +
pre-process.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 53 insertions(+)
diff --git a/ident-list.h b/ident-list.h
index d65668108385..bffc4038a75d 100644
--- a/ident-list.h
+++ b/ident-list.h
@@ -66,6 +66,7 @@ IDENT(c_static_assert);
__IDENT(pragma_ident, "__pragma__", 0);
__IDENT(_Pragma_ident, "_Pragma", 0);
__IDENT(__VA_ARGS___ident, "__VA_ARGS__", 0);
+__IDENT(__VA_OPT___ident, "__VA_OPT__", 0);
__IDENT(__func___ident, "__func__", 0);
__IDENT(__FUNCTION___ident, "__FUNCTION__", 0);
__IDENT(__PRETTY_FUNCTION___ident, "__PRETTY_FUNCTION__", 0);
diff --git a/pre-process.c b/pre-process.c
index 05a5a79396a8..94710301ce17 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -643,11 +643,49 @@ static int handle_kludge(struct token **p, struct arg *args)
}
}
+static struct token *get_VA_OPT(struct token **p, struct arg *args)
+{
+ struct token *t = (*p)->next;
+ const char *expected;
+ struct token *ret = NULL;
+ struct token *dup, *tail = NULL;
+
+ if (token_type(t) != TOKEN_SPECIAL || t->special != '(') {
+ expected = "(";
+ goto error;
+ }
+ while (true) {
+ t = t->next;
+ if (eof_token(t)) {
+ expected = ")";
+ goto error;
+ }
+ if (token_type(t) == TOKEN_SPECIAL &&
+ t->special == ')')
+ break;
+ dup = dup_token(t, &(*p)->pos);
+ if (!ret)
+ ret = dup;
+ else
+ tail->next = dup;
+ tail = dup;
+ }
+
+ if (tail)
+ tail->next = &eof_token_entry;
+ *p = t;
+ return ret;
+error:
+ sparse_error(t->pos, "__VA_OPT__ error: expected '%s'", expected);
+ return NULL;
+}
+
static struct token **substitute(struct token **list, struct token *body, struct arg *args)
{
struct position *base_pos = &(*list)->pos;
int *count;
enum {Normal, Placeholder, Concat} state = Normal;
+ struct token *va_opt = NULL;
for (; !eof_token(body); body = body->next) {
struct token *added, *arg;
@@ -697,6 +735,10 @@ static struct token **substitute(struct token **list, struct token *body, struct
case TOKEN_MACRO_ARGUMENT:
arg = args[body->argnum].expanded;
+ if (va_opt && !eof_token(arg)) {
+ list = substitute(list, va_opt, args);
+ va_opt = NULL;
+ }
count = &args[body->argnum].n_normal;
if (eof_token(arg)) {
state = Normal;
@@ -716,6 +758,11 @@ static struct token **substitute(struct token **list, struct token *body, struct
continue;
case TOKEN_IDENT:
+ if (body->ident == &__VA_OPT___ident) {
+ va_opt = get_VA_OPT(&body, args);
+ continue;
+ }
+
added = dup_token(body, base_pos);
if (added->ident->tainted)
added->pos.noexpand = 1;
@@ -1234,6 +1281,8 @@ static struct token *parse_arguments(struct token *list)
while (token_type(arg) == TOKEN_IDENT) {
if (arg->ident == &__VA_ARGS___ident)
goto Eva_args;
+ if (arg->ident == &__VA_OPT___ident)
+ goto Eva_opt;
if (!++count->normal)
goto Eargs;
next = arg->next;
@@ -1312,6 +1361,9 @@ Enotclosed:
Eva_args:
sparse_error(arg->pos, "__VA_ARGS__ can only appear in the expansion of a C99 variadic macro");
return NULL;
+Eva_opt:
+ sparse_error(arg->pos, "__VA_OPT__ can only appear in the expansion of a C99 variadic macro");
+ return NULL;
Eargs:
sparse_error(arg->pos, "too many arguments in macro definition");
return NULL;
--
2.51.0
* Re: [PATCH] sparse: add support for __VA_OPT__
2026-02-24 11:07 ` [PATCH] sparse: add support for __VA_OPT__ Dan Carpenter
@ 2026-02-24 11:16 ` Ben Dooks
2026-02-24 11:56 ` Dan Carpenter
2026-02-25 2:39 ` Chris Li
1 sibling, 1 reply; 42+ messages in thread
From: Ben Dooks @ 2026-02-24 11:16 UTC
To: Dan Carpenter, linux-sparse, Chris Li
Cc: Linus Torvalds, Ricardo Ribalda, Hans Verkuil, Al Viro,
Richard Fitzgerald
On 24/02/2026 11:07, Dan Carpenter wrote:
> The linux kernel has started using __VA_OPT__ so lets add support for it.
>
> What it does is it adds an optional thing, normally a comma, if the
> __VA_ARGS__ parameter is not empty. So if you have at least one argument
> but possibly more then you could create a macro like:
>
> #define test_args(a, ...) printf(a __VA_OPT__(,) __VA_ARGS__)
>
> If you call test_args("foo\n") it expands to:
>
> printf("foo\n");
>
> but if you pass two arguments test_args("foo %d\n", 2) it expands to:
>
> printf("foo %d\n" , 2);
>
> I guess if you wanted instead of a comma then you could have an empty
> __VA_OPT__() or you could pass random things like __VA_OPT__(a b c). I
> don't know why you would do that. In this case, "a" is a macro argument
> so that has to be expanded out.
>
> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
>
This doesn't apply to the current tree with git am.
Current head for me is 37156835e3d725b6d750f000be33ba3814bb2310
$ git am < \[PATCH\]\ sparse\:\ add\ support\ for\ __VA_OPT__\ -\ Dan\ Carpenter\ \<dan.carpenter@linaro.org\>\ -\ 2026-02-24\ 1107.eml
Applying: sparse: add support for __VA_OPT__
error: patch failed: ident-list.h:66
error: ident-list.h: patch does not apply
error: patch failed: pre-process.c:643
error: pre-process.c: patch does not apply
Patch failed at 0001 sparse: add support for __VA_OPT__
--
Ben Dooks http://www.codethink.co.uk/
Senior Engineer Codethink - Providing Genius
https://www.codethink.co.uk/privacy.html
* Re: [PATCH] sparse: add support for __VA_OPT__
2026-02-24 11:16 ` Ben Dooks
@ 2026-02-24 11:56 ` Dan Carpenter
2026-02-24 12:42 ` Richard Fitzgerald
0 siblings, 1 reply; 42+ messages in thread
From: Dan Carpenter @ 2026-02-24 11:56 UTC
To: Ben Dooks
Cc: linux-sparse, Chris Li, Linus Torvalds, Ricardo Ribalda,
Hans Verkuil, Al Viro, Richard Fitzgerald
Oh, sorry, Al's branch changed that code more than I imagined. I would
resend the patch, but I need to figure out this GCC warning...
CC pre-process.o
pre-process.c: In function ‘substitute’:
pre-process.c:785:16: warning: function may return address of local variable [-Wreturn-local-addr]
785 | return list;
| ^~~~
pre-process.c:687:31: note: declared here
687 | struct token *added, *arg;
| ^~~~~
Anyway, here is what I've forward ported.
regards,
dan carpenter
---
ident-list.h | 1 +
pre-process.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 54 insertions(+)
diff --git a/ident-list.h b/ident-list.h
index 3c08e8ca9aa4..556d4050c88d 100644
--- a/ident-list.h
+++ b/ident-list.h
@@ -65,6 +65,7 @@ IDENT(c_generic_selections);
IDENT(c_static_assert);
__IDENT(pragma_ident, "__pragma__", 0);
__IDENT(__VA_ARGS___ident, "__VA_ARGS__", 0);
+__IDENT(__VA_OPT___ident, "__VA_OPT__", 0);
__IDENT(__func___ident, "__func__", 0);
__IDENT(__FUNCTION___ident, "__FUNCTION__", 0);
__IDENT(__PRETTY_FUNCTION___ident, "__PRETTY_FUNCTION__", 0);
diff --git a/pre-process.c b/pre-process.c
index 4e322855d600..364c68489a84 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -638,11 +638,50 @@ static int handle_kludge(const struct token **p, struct arg *args)
}
}
+static struct token *get_VA_OPT(const struct token **p, struct arg *args)
+{
+ struct position base_pos = (*p)->pos;
+ struct token *t = (*p)->next;
+ const char *expected;
+ struct token *ret = NULL;
+ struct token *dup, *tail = NULL;
+
+ if (token_type(t) != TOKEN_SPECIAL || t->special != '(') {
+ expected = "(";
+ goto error;
+ }
+ while (true) {
+ t = t->next;
+ if (eof_token(t)) {
+ expected = ")";
+ goto error;
+ }
+ if (token_type(t) == TOKEN_SPECIAL &&
+ t->special == ')')
+ break;
+ dup = dup_token(t, &base_pos);
+ if (!ret)
+ ret = dup;
+ else
+ tail->next = dup;
+ tail = dup;
+ }
+
+ if (tail)
+ tail->next = &eof_token_entry;
+ *p = t;
+ return ret;
+error:
+ sparse_error(t->pos, "__VA_OPT__ error: expected '%s'", expected);
+ return NULL;
+}
+
static struct token **substitute(struct token **list, const struct token *body, struct arg *args)
{
struct position *base_pos = &(*list)->pos;
int *count;
enum {Normal, Placeholder, Concat} state = Normal;
+ struct token *va_opt = NULL;
for (; !eof_token(body); body = body->next) {
struct token *added, *arg;
@@ -692,6 +731,10 @@ static struct token **substitute(struct token **list, const struct token *body,
case TOKEN_MACRO_ARGUMENT:
arg = args[body->argnum].expanded;
+ if (va_opt && !eof_token(arg)) {
+ list = substitute(list, va_opt, args);
+ va_opt = NULL;
+ }
count = &args[body->argnum].n_normal;
if (eof_token(arg)) {
state = Normal;
@@ -711,6 +754,11 @@ static struct token **substitute(struct token **list, const struct token *body,
continue;
default:
+ if (token_type(body) == TOKEN_IDENT &&
+ body->ident == &__VA_OPT___ident) {
+ va_opt = get_VA_OPT(&body, args);
+ continue;
+ }
added = dup_token(body, base_pos);
if (token_type(body) == TOKEN_IDENT &&
added->ident->tainted)
@@ -1094,6 +1142,8 @@ static struct token *parse_arguments(struct token *list)
while (token_type(arg) == TOKEN_IDENT) {
if (arg->ident == &__VA_ARGS___ident)
goto Eva_args;
+ if (arg->ident == &__VA_OPT___ident)
+ goto Eva_opt;
if (!++count->normal)
goto Eargs;
next = arg->next;
@@ -1172,6 +1222,9 @@ Enotclosed:
Eva_args:
sparse_error(arg->pos, "__VA_ARGS__ can only appear in the expansion of a C99 variadic macro");
return NULL;
+Eva_opt:
+ sparse_error(arg->pos, "__VA_OPT__ can only appear in the expansion of a C99 variadic macro");
+ return NULL;
Eargs:
sparse_error(arg->pos, "too many arguments in macro definition");
return NULL;
--
2.51.0
* Re: [PATCH] sparse: add support for __VA_OPT__
2026-02-24 11:56 ` Dan Carpenter
@ 2026-02-24 12:42 ` Richard Fitzgerald
2026-02-24 13:15 ` Ben Dooks
0 siblings, 1 reply; 42+ messages in thread
From: Richard Fitzgerald @ 2026-02-24 12:42 UTC
To: Dan Carpenter, Ben Dooks
Cc: linux-sparse, Chris Li, Linus Torvalds, Ricardo Ribalda,
Hans Verkuil, Al Viro
On 24/02/2026 11:56 am, Dan Carpenter wrote:
> Oh, sorry, Al's branch changed that code more than I imagined. I would
> resend the patch, but I need to figure out this GCC warning...
>
> CC pre-process.o
> pre-process.c: In function ‘substitute’:
> pre-process.c:785:16: warning: function may return address of local variable [-Wreturn-local-addr]
> 785 | return list;
> | ^~~~
> pre-process.c:687:31: note: declared here
> 687 | struct token *added, *arg;
> | ^~~~~
>
> Anyway, here is what I've forward ported.
>
> regards,
> dan carpenter
>
Tested-by: Richard Fitzgerald <rf@opensource.cirrus.com>
* Re: [PATCH] sparse: add support for __VA_OPT__
2026-02-24 12:42 ` Richard Fitzgerald
@ 2026-02-24 13:15 ` Ben Dooks
0 siblings, 0 replies; 42+ messages in thread
From: Ben Dooks @ 2026-02-24 13:15 UTC
To: Richard Fitzgerald, Dan Carpenter
Cc: linux-sparse, Chris Li, Linus Torvalds, Ricardo Ribalda,
Hans Verkuil, Al Viro
On 24/02/2026 12:42, Richard Fitzgerald wrote:
> On 24/02/2026 11:56 am, Dan Carpenter wrote:
>> Oh, sorry, Al's branch changed that code more than I imagined. I would
>> resend the patch, but I need to figure out this GCC warning...
>>
>> CC pre-process.o
>> pre-process.c: In function ‘substitute’:
>> pre-process.c:785:16: warning: function may return address of local
>> variable [-Wreturn-local-addr]
>> 785 | return list;
>> | ^~~~
>> pre-process.c:687:31: note: declared here
>> 687 | struct token *added, *arg;
>> | ^~~~~
>>
>> Anyway, here is what I've forward ported.
>>
>> regards,
>> dan carpenter
>>
>
> Tested-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Tested-by: Ben Dooks <ben.dooks@codethink.co.uk>
--
Ben Dooks http://www.codethink.co.uk/
Senior Engineer Codethink - Providing Genius
https://www.codethink.co.uk/privacy.html
* Re: [PATCH] sparse: add support for __VA_OPT__
2026-02-24 11:07 ` [PATCH] sparse: add support for __VA_OPT__ Dan Carpenter
2026-02-24 11:16 ` Ben Dooks
@ 2026-02-25 2:39 ` Chris Li
2026-02-25 3:36 ` Al Viro
1 sibling, 1 reply; 42+ messages in thread
From: Chris Li @ 2026-02-25 2:39 UTC
To: Dan Carpenter
Cc: linux-sparse, Linus Torvalds, Ricardo Ribalda, Hans Verkuil,
Ben Dooks, Al Viro, Richard Fitzgerald, Xuanhao Zhang
Hi Dan,
Thanks for the patch.
BTW, can you CC my personal email next time? Thanks.
On Tue, Feb 24, 2026 at 3:12 AM Dan Carpenter <dan.carpenter@linaro.org> wrote:
>
> The linux kernel has started using __VA_OPT__ so lets add support for it.
>
> What it does is it adds an optional thing, normally a comma, if the
> __VA_ARGS__ parameter is not empty. So if you have at least one argument
> but possibly more then you could create a macro like:
>
> #define test_args(a, ...) printf(a __VA_OPT__(,) __VA_ARGS__)
>
> If you call test_args("foo\n") it expands to:
>
> printf("foo\n");
>
> but if you pass two arguments test_args("foo %d\n", 2) it expands to:
>
> printf("foo %d\n" , 2);
>
> I guess if you wanted instead of a comma then you could have an empty
> __VA_OPT__() or you could pass random things like __VA_OPT__(a b c). I
> don't know why you would do that. In this case, "a" is a macro argument
> so that has to be expanded out.
>
> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
> ---
> ident-list.h | 1 +
> pre-process.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 53 insertions(+)
>
> diff --git a/ident-list.h b/ident-list.h
> index d65668108385..bffc4038a75d 100644
> --- a/ident-list.h
> +++ b/ident-list.h
> @@ -66,6 +66,7 @@ IDENT(c_static_assert);
> __IDENT(pragma_ident, "__pragma__", 0);
> __IDENT(_Pragma_ident, "_Pragma", 0);
> __IDENT(__VA_ARGS___ident, "__VA_ARGS__", 0);
> +__IDENT(__VA_OPT___ident, "__VA_OPT__", 0);
> __IDENT(__func___ident, "__func__", 0);
> __IDENT(__FUNCTION___ident, "__FUNCTION__", 0);
> __IDENT(__PRETTY_FUNCTION___ident, "__PRETTY_FUNCTION__", 0);
> diff --git a/pre-process.c b/pre-process.c
> index 05a5a79396a8..94710301ce17 100644
> --- a/pre-process.c
> +++ b/pre-process.c
> @@ -643,11 +643,49 @@ static int handle_kludge(struct token **p, struct arg *args)
> }
> }
>
> +static struct token *get_VA_OPT(struct token **p, struct arg *args)
> +{
> + struct token *t = (*p)->next;
> + const char *expected;
> + struct token *ret = NULL;
> + struct token *dup, *tail;
> +
> + if (token_type(t) != TOKEN_SPECIAL || t->special != '(') {
> + expected = "(";
> + goto error;
> + }
> + while (true) {
> + t = t->next;
> + if (eof_token(t)) {
> + expected = ")";
> + goto error;
> + }
> + if (token_type(t) == TOKEN_SPECIAL &&
> + t->special == ')')
> + break;
I think this approach has some limitations. You could use
collect_arguments() to collect and expand the arguments of
__VA_OPT__().
I wrote a small test case to show that gcc actually expands macros
inside __VA_OPT__().
============ terminal ===============
$ cat ~/tmp/va_opt.c
#define A(B) ,
#define foo(fmt, ...) printk(fmt __VA_OPT__(A(x)) __VA_ARGS__)
foo("\n");
foo("\n", "a");
$ gcc -E ~/tmp/va_opt.c
# 0 "/home/chrisl/tmp/va_opt.c"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 0 "<command-line>" 2
# 1 "/home/chrisl/tmp/va_opt.c"
printk("\n" );
printk("\n" , "a");
========== terminal =============
That shows that gcc correctly expands the A() macro inside __VA_OPT__(A(x)).
I have not tested your patch yet, but from reading it I assume it
will not expand A() here.
One idea is to add an expand function, expand___VA_OPT__, similar to
expand_has_feature(), and register it in the dynamic expand macro array.
That way you get collect_arguments() for free and it behaves just
like a builtin macro expansion. Just duplicate the collected arg list
to the current token if there are extra arguments.
If you go that route, you likely don't need the manual __VA_OPT___ident parsing.
I haven't had time to code it myself yet; it's just an idea.
Chris
> + dup = dup_token(t, &(*p)->pos);
> + if (!ret)
> + ret = dup;
> + else
> + tail->next = dup;
> + tail = dup;
> + }
> +
> + if (tail)
> + tail->next = &eof_token_entry;
> + *p = t;
> + return ret;
> +error:
> + sparse_error(t->pos, "__VA_OPT__ error: expected '%s'", expected);
> + return NULL;
> +}
> +
> static struct token **substitute(struct token **list, struct token *body, struct arg *args)
> {
> struct position *base_pos = &(*list)->pos;
> int *count;
> enum {Normal, Placeholder, Concat} state = Normal;
> + struct token *va_opt = NULL;
>
> for (; !eof_token(body); body = body->next) {
> struct token *added, *arg;
> @@ -697,6 +735,10 @@ static struct token **substitute(struct token **list, struct token *body, struct
>
> case TOKEN_MACRO_ARGUMENT:
> arg = args[body->argnum].expanded;
> + if (va_opt && !eof_token(arg)) {
> + list = substitute(list, va_opt, args);
> + va_opt = NULL;
> + }
> count = &args[body->argnum].n_normal;
> if (eof_token(arg)) {
> state = Normal;
> @@ -716,6 +758,11 @@ static struct token **substitute(struct token **list, struct token *body, struct
> continue;
>
> case TOKEN_IDENT:
> + if (body->ident == &__VA_OPT___ident) {
> + va_opt = get_VA_OPT(&body, args);
> + continue;
> + }
> +
> added = dup_token(body, base_pos);
> if (added->ident->tainted)
> added->pos.noexpand = 1;
> @@ -1234,6 +1281,8 @@ static struct token *parse_arguments(struct token *list)
> while (token_type(arg) == TOKEN_IDENT) {
> if (arg->ident == &__VA_ARGS___ident)
> goto Eva_args;
> + if (arg->ident == &__VA_OPT___ident)
> + goto Eva_opt;
> if (!++count->normal)
> goto Eargs;
> next = arg->next;
> @@ -1312,6 +1361,9 @@ Enotclosed:
> Eva_args:
> sparse_error(arg->pos, "__VA_ARGS__ can only appear in the expansion of a C99 variadic macro");
> return NULL;
> +Eva_opt:
> + sparse_error(arg->pos, "__VA_OPT__ can only appear in the expansion of a C99 variadic macro");
> + return NULL;
> Eargs:
> sparse_error(arg->pos, "too many arguments in macro definition");
> return NULL;
> --
> 2.51.0
>
>
* Re: [PATCH] sparse: add support for __VA_OPT__
2026-02-25 2:39 ` Chris Li
@ 2026-02-25 3:36 ` Al Viro
2026-02-25 5:29 ` [RFC PATCH] pre-process: add __VA_OPT__ support Eric Zhang
2026-02-25 7:05 ` [PATCH] sparse: add support for __VA_OPT__ Chris Li
0 siblings, 2 replies; 42+ messages in thread
From: Al Viro @ 2026-02-25 3:36 UTC
To: Chris Li
Cc: Dan Carpenter, linux-sparse, Linus Torvalds, Ricardo Ribalda,
Hans Verkuil, Ben Dooks, Richard Fitzgerald, Xuanhao Zhang
On Tue, Feb 24, 2026 at 06:39:57PM -0800, Chris Li wrote:
> > I guess if you wanted instead of a comma then you could have an empty
> > __VA_OPT__() or you could pass random things like __VA_OPT__(a b c). I
> > don't know why you would do that. In this case, "a" is a macro argument
> > so that has to be expanded out.
<sarcasm>
Then perhaps reading the standard might prove enlightening - possibly
due to examples that might be in there, or seeing the actual description
of semantics? https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf
is there and searching for __VA_OPT__ would immediately get you this:
#define SDEF(sname, ...) S sname __VA_OPT__(= { __VA_ARGS__ })
SDEF(foo); // replaced by S foo;
SDEF(bar, 1, 2); // replaced by S bar = { 1, 2 };
Would that answer your question?
</sarcasm>
> One idea is to add an expand function expand___VA_OPT__, similar to
> expand_has_feature(). Register it in the dynamic expand macro array.
> That way you get the collect_arguments() for free and it behaves just
> like a builtin macro expansion. Just duplicate the collected arg list
> to the current token, if there are extra arguments.
Sorry, no go; for one thing, #__VA_OPT__() won't be dealt with that way,
for another there's fun with foo ## __VA_OPT__(arg) (argument is expanded
*and* subjected to ## - yes, it's possible now).
I have something resembling a workable approach, but there's nasty corner
case when you mix it with the side effects; that's impossible in standard
C, but gcc has the sodding __COUNTER__ thing and _that_ makes life really
interesting.
If there's __VA_OPT__ in the body, we want to expand vararg whether it
occurs in the body or not.
We want to expand each argument that is present in the body.
Question: should we expand an argument that occurs *only* under __VA_OPT__?
Note that "expand and discard" is *not* a no-op - expansion of __COUNTER__
will have visible side effects. What's more, gcc and clang diverge there.
Another kind of side effect is possible in standard C: argument substitution
might fail when attempted. And gcc is arguably broken there - see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123325 for fun details.
clang handles that one sanely...
I got stalled waiting for gcc folks to respond, then sidetracked to other
stuff. I'll resurrect that stuff later this week.
* [RFC PATCH] pre-process: add __VA_OPT__ support
2026-02-25 3:36 ` Al Viro
@ 2026-02-25 5:29 ` Eric Zhang
2026-02-25 6:40 ` Al Viro
2026-02-25 7:05 ` [PATCH] sparse: add support for __VA_OPT__ Chris Li
1 sibling, 1 reply; 42+ messages in thread
From: Eric Zhang @ 2026-02-25 5:29 UTC
To: linux-sparse
Cc: dan.carpenter, viro, chriscli, ben.dooks, rf, torvalds,
Eric Zhang
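Add __VA_OPT__ support (C23 6.10.5) including some tests.
At definition time, parse_expansion() turns __VA_OPT__( and its closing
parenthesis into TOKEN_VA_OPT_START and TOKEN_VA_OPT_END markers.
At expansion time, substitute() checks whether the variadic
argument is empty: if so, it skips the region between the markers;
otherwise, it processes the enclosed tokens normally.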
Signed-off-by: Eric Zhang <zxh@xh-zhang.com>
---
I discussed this with Chris during lunch yesterday and got curious
about the problem, so I made a few attempts. Introducing
TOKEN_VA_OPT_START/END feels a bit like an anti-pattern, but I
couldn't find a cleaner way to handle it without new token types.
Happy to hear suggestions.
Note: this does not handle the __COUNTER__ side-effect issue
(arguments under __VA_OPT__ are expanded even when discarded).
ident-list.h | 1 +
pre-process.c | 132 ++++++++++++++++++++++++
token.h | 2 +
tokenize.c | 6 ++
validation/preprocessor/va-opt-errors.c | 38 +++++++
validation/preprocessor/va-opt.c | 66 ++++++++++++
6 files changed, 245 insertions(+)
create mode 100644 validation/preprocessor/va-opt-errors.c
create mode 100644 validation/preprocessor/va-opt.c
diff --git a/ident-list.h b/ident-list.h
index 3c08e8ca..556d4050 100644
--- a/ident-list.h
+++ b/ident-list.h
@@ -65,6 +65,7 @@ IDENT(c_generic_selections);
IDENT(c_static_assert);
__IDENT(pragma_ident, "__pragma__", 0);
__IDENT(__VA_ARGS___ident, "__VA_ARGS__", 0);
+__IDENT(__VA_OPT___ident, "__VA_OPT__", 0);
__IDENT(__func___ident, "__func__", 0);
__IDENT(__FUNCTION___ident, "__FUNCTION__", 0);
__IDENT(__PRETTY_FUNCTION___ident, "__PRETTY_FUNCTION__", 0);
diff --git a/pre-process.c b/pre-process.c
index 4e322855..ec4c7b98 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -643,13 +643,37 @@ static struct token **substitute(struct token **list, const struct token *body,
struct position *base_pos = &(*list)->pos;
int *count;
enum {Normal, Placeholder, Concat} state = Normal;
+ int va_opt_ws = 0;
for (; !eof_token(body); body = body->next) {
struct token *added, *arg;
struct token **tail;
const struct token *t;
+ struct arg *va;
+ int is_empty;
switch (token_type(body)) {
+ case TOKEN_VA_OPT_START:
+ va = &args[body->argnum];
+ is_empty = (!va->arg || eof_token(va->arg)) &&
+ (!va->expanded || eof_token(va->expanded));
+ if (is_empty) {
+ /* empty varargs: skip to end marker */
+ while (token_type(body) != TOKEN_VA_OPT_END)
+ body = body->next;
+ if (state == Concat)
+ state = Normal;
+ else
+ state = Placeholder;
+ continue;
+ }
+ /* non-empty: skip marker, transfer whitespace */
+ va_opt_ws = body->pos.whitespace;
+ continue;
+
+ case TOKEN_VA_OPT_END:
+ continue;
+
case TOKEN_GNU_KLUDGE:
/*
* GNU kludge: if we had <comma>##<vararg>, behaviour
@@ -728,6 +752,10 @@ static struct token **substitute(struct token **list, const struct token *body,
if (tail != &added->next)
list = tail;
} else {
+ if (va_opt_ws) {
+ added->pos.whitespace = va_opt_ws;
+ va_opt_ws = 0;
+ }
*list = added;
list = tail;
}
@@ -747,6 +775,8 @@ static int expand(struct token **list, struct symbol *sym)
int nargs = sym->arglist ? sym->arglist->count.normal : 0;
struct arg args[nargs];
+ memset(args, 0, sizeof(args));
+
if (expanding->tainted) {
token->pos.noexpand = 1;
return 1;
@@ -1019,6 +1049,7 @@ static int token_different(struct token *t1, struct token *t2)
case TOKEN_UNTAINT:
case TOKEN_CONCAT:
case TOKEN_GNU_KLUDGE:
+ case TOKEN_VA_OPT_END:
different = 0;
break;
case TOKEN_NUMBER:
@@ -1030,6 +1061,7 @@ static int token_different(struct token *t1, struct token *t2)
case TOKEN_MACRO_ARGUMENT:
case TOKEN_QUOTED_ARGUMENT:
case TOKEN_STR_ARGUMENT:
+ case TOKEN_VA_OPT_START:
different = t1->argnum != t2->argnum;
break;
case TOKEN_CHAR_EMBEDDED_0 ... TOKEN_CHAR_EMBEDDED_3:
@@ -1291,21 +1323,94 @@ Econcat:
return NULL;
}
+static int find_vararg_index(struct token *arglist)
+{
+ struct token *p;
+ int nr = 0;
+
+ if (!arglist)
+ return -1;
+ for (p = arglist->next; !eof_token(p); p = p->next->next, nr++) {
+ if (p->next->count.vararg)
+ return nr;
+ }
+ return -1;
+}
+
static struct token *parse_expansion(struct token *expansion, struct token *arglist, struct ident *name)
{
struct token *token = expansion;
struct token **p;
+ struct token *va_opt_start = NULL;
+ int in_va_opt = 0;
+ int va_opt_paren_depth = 0;
+ int vararg_index = find_vararg_index(arglist);
if (match_op(token, SPECIAL_HASHHASH))
goto Econcat;
for (p = &expansion; !eof_token(token); p = &token->next, token = *p) {
+ /* Handle __VA_OPT__(...) */
+ if (token_type(token) == TOKEN_IDENT && token->ident == &__VA_OPT___ident) {
+ struct token *next = token->next;
+
+ if (vararg_index < 0)
+ goto Eva_opt_nonva;
+ if (in_va_opt)
+ goto Eva_opt_nested;
+ if (!match_op(next, '('))
+ goto Eva_opt_paren;
+
+ /* Convert __VA_OPT__ token to TOKEN_VA_OPT_START */
+ token_type(token) = TOKEN_VA_OPT_START;
+ token->argnum = vararg_index;
+
+ /* Remove the '(' token from the list */
+ token->next = next->next;
+ __free_token(next);
+
+ /* C23: ## cannot be first token inside __VA_OPT__ */
+ if (match_op(token->next, SPECIAL_HASHHASH))
+ goto Eva_opt_hashhash;
+
+ va_opt_start = token;
+ in_va_opt = 1;
+ va_opt_paren_depth = 0;
+ continue;
+ }
+
+ /* Track parentheses inside __VA_OPT__(...) */
+ if (in_va_opt) {
+ if (match_op(token, '(')) {
+ va_opt_paren_depth++;
+ } else if (match_op(token, ')')) {
+ if (va_opt_paren_depth == 0) {
+ /* C23: ## cannot be last inside __VA_OPT__ */
+ if (token_type(va_opt_start) == TOKEN_CONCAT)
+ goto Eva_opt_hashhash;
+ /* This is the closing ) of __VA_OPT__ */
+ token_type(token) = TOKEN_VA_OPT_END;
+ in_va_opt = 0;
+ continue;
+ }
+ va_opt_paren_depth--;
+ }
+ }
+
if (match_op(token, '#')) {
token = handle_hash(p, arglist);
if (!token)
return NULL;
}
if (match_op(token->next, SPECIAL_HASHHASH)) {
+ /* C23: ## cannot be last inside __VA_OPT__ */
+ if (in_va_opt && va_opt_paren_depth == 0) {
+ struct token *t = token->next;
+ while (match_op(t, SPECIAL_HASHHASH))
+ t = t->next;
+ if (match_op(t, ')'))
+ goto Eva_opt_hashhash;
+ }
token = handle_hashhash(token, arglist);
if (!token)
return NULL;
@@ -1314,7 +1419,15 @@ static struct token *parse_expansion(struct token *expansion, struct token *argl
}
if (token_type(token) == TOKEN_ERROR)
goto Earg;
+ if (in_va_opt)
+ va_opt_start = token;
}
+
+ if (in_va_opt) {
+ sparse_error(expansion->pos, "unterminated __VA_OPT__");
+ return NULL;
+ }
+
token = alloc_token(&expansion->pos);
token_type(token) = TOKEN_UNTAINT;
token->ident = name;
@@ -1328,8 +1441,21 @@ Econcat:
Earg:
sparse_error(token->pos, "too many instances of argument in body");
return NULL;
+Eva_opt_nonva:
+ sparse_error(token->pos, "__VA_OPT__ can only appear in the expansion of a variadic macro");
+ return NULL;
+Eva_opt_nested:
+ sparse_error(token->pos, "__VA_OPT__ may not be nested");
+ return NULL;
+Eva_opt_paren:
+ sparse_error(token->pos, "__VA_OPT__ must be followed by '('");
+ return NULL;
+Eva_opt_hashhash:
+ sparse_error(token->pos, "'##' cannot appear at either end of __VA_OPT__");
+ return NULL;
}
+
static int do_define(struct position pos, struct token *token, struct ident *name,
struct token *arglist, struct token *expansion, int attr)
{
@@ -2316,6 +2442,12 @@ static void dump_macro(struct symbol *sym)
case TOKEN_CONCAT:
printf("##");
break;
+ case TOKEN_VA_OPT_START:
+ printf("__VA_OPT__(");
+ break;
+ case TOKEN_VA_OPT_END:
+ printf(")");
+ break;
case TOKEN_STR_ARGUMENT:
printf("#");
/* fall-through */
diff --git a/token.h b/token.h
index 9000e0cb..8e05672b 100644
--- a/token.h
+++ b/token.h
@@ -104,6 +104,8 @@ enum token_type {
TOKEN_QUOTED_ARGUMENT,
TOKEN_CONCAT,
TOKEN_GNU_KLUDGE,
+ TOKEN_VA_OPT_START,
+ TOKEN_VA_OPT_END,
TOKEN_UNTAINT,
TOKEN_ARG_COUNT,
TOKEN_IF,
diff --git a/tokenize.c b/tokenize.c
index 54ea348c..44b128b7 100644
--- a/tokenize.c
+++ b/tokenize.c
@@ -237,6 +237,12 @@ const char *show_token(const struct token *token)
sprintf(buffer, "<end of '%s'>", stream_name(token->pos.stream));
return buffer;
+ case TOKEN_VA_OPT_START:
+ return "__VA_OPT__(";
+
+ case TOKEN_VA_OPT_END:
+ return ")";
+
case TOKEN_UNTAINT:
sprintf(buffer, "<untaint>");
return buffer;
diff --git a/validation/preprocessor/va-opt-errors.c b/validation/preprocessor/va-opt-errors.c
new file mode 100644
index 00000000..392b272b
--- /dev/null
+++ b/validation/preprocessor/va-opt-errors.c
@@ -0,0 +1,38 @@
+/*
+ * __VA_OPT__ error cases (C23 6.10.5)
+ *
+ * Constraints from C23 (N3220 6.10.5):
+ * - __VA_OPT__ shall only occur in the replacement-list of a
+ * function-like macro that uses the ellipsis notation.
+ * - __VA_OPT__ shall not appear within its own replacement tokens.
+ * - ## shall not appear at either end of __VA_OPT__().
+ */
+
+/* non-variadic macro */
+#define NONVAR(x) __VA_OPT__(,)
+
+/* nested __VA_OPT__ */
+#define NESTED(...) __VA_OPT__(__VA_OPT__(x))
+
+/* not followed by ( */
+#define NOPAREN(...) __VA_OPT__ x
+
+/* ## at start */
+#define HASH_START(...) __VA_OPT__(## x)
+
+/* ## at end */
+#define HASH_END(...) __VA_OPT__(x ##)
+
+/*
+ * check-name: __VA_OPT__ errors (C23)
+ * check-command: sparse -E $file
+ * check-output-ignore
+ *
+ * check-error-start
+preprocessor/va-opt-errors.c:12:19: error: __VA_OPT__ can only appear in the expansion of a variadic macro
+preprocessor/va-opt-errors.c:15:32: error: __VA_OPT__ may not be nested
+preprocessor/va-opt-errors.c:18:22: error: __VA_OPT__ must be followed by '('
+preprocessor/va-opt-errors.c:21:25: error: '##' cannot appear at either end of __VA_OPT__
+preprocessor/va-opt-errors.c:24:34: error: '##' cannot appear at either end of __VA_OPT__
+ * check-error-end
+ */
diff --git a/validation/preprocessor/va-opt.c b/validation/preprocessor/va-opt.c
new file mode 100644
index 00000000..52814fc2
--- /dev/null
+++ b/validation/preprocessor/va-opt.c
@@ -0,0 +1,66 @@
+/*
+ * __VA_OPT__ support (C23 6.10.5)
+ */
+
+/* Basic: comma insertion */
+#define A(x, ...) x __VA_OPT__(,) __VA_ARGS__
+A(1)
+A(1, 2)
+A(1, 2, 3)
+
+/* Multiple tokens inside __VA_OPT__ */
+#define B(x, ...) x __VA_OPT__(+ __VA_ARGS__ + 0)
+B(1)
+B(1, 2)
+
+/* Empty __VA_OPT__ content (just controls comma) */
+#define C(...) start __VA_OPT__(, __VA_ARGS__) end
+C()
+C(a)
+C(a, b)
+
+/* __VA_OPT__ with stringify */
+#define D(x, ...) x __VA_OPT__(, #__VA_ARGS__)
+D(1)
+D(1, hello world)
+
+/* Named varargs with __VA_OPT__ */
+#define E(x, args...) x __VA_OPT__(,) args
+E(1)
+E(1, 2)
+
+/* Empty __VA_OPT__() */
+#define F(...) prefix __VA_OPT__() suffix
+F()
+F(1)
+
+/* default_gfp() pattern from the kernel */
+#define __default_gfp(a,...) a
+#define default_gfp(...) __default_gfp(__VA_ARGS__ __VA_OPT__(,) 999)
+default_gfp()
+default_gfp(42)
+
+/*
+ * check-name: __VA_OPT__ support (C23)
+ * check-command: sparse -E $file
+ *
+ * check-output-start
+
+1
+1 , 2
+1 , 2, 3
+1
+1 + 2 + 0
+start end
+start , a end
+start , a, b end
+1
+1 , "hello world"
+1
+1 , 2
+prefix suffix
+prefix suffix
+999
+42
+ * check-output-end
+ */
--
2.43.0
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [RFC PATCH] pre-process: add __VA_OPT__ support
2026-02-25 5:29 ` [RFC PATCH] pre-process: add __VA_OPT__ support Eric Zhang
@ 2026-02-25 6:40 ` Al Viro
2026-02-25 7:27 ` Al Viro
0 siblings, 1 reply; 42+ messages in thread
From: Al Viro @ 2026-02-25 6:40 UTC (permalink / raw)
To: Eric Zhang; +Cc: linux-sparse, dan.carpenter, chriscli, ben.dooks, rf, torvalds
On Tue, Feb 24, 2026 at 09:29:50PM -0800, Eric Zhang wrote:
> Add __VA_OPT__ support (C23 6.10.5) including some tests.
>
> At expansion time, substitute() checks whether the variadic
> argument is empty: if so, skip the region between the markers;
> otherwise, process the enclosed tokens normally.
>
> Signed-off-by: Eric Zhang <zxh@xh-zhang.com>
> ---
> Discussed this with Chris during lunch yesterday and got curious
> about the problem, so made a few attempts. Introducing
> TOKEN_VA_OPT_START/END feels a bit like an anti-pattern, but I
> couldn't find a cleaner way to handle it without new token types.
> Happy to hear suggestions.
Your variant will break with __VA_OPT__ following #. It won't do the
right thing with ## either, AFAICS, in case when __VA_OPT__ token list
is empty.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] sparse: add support for __VA_OPT__
2026-02-25 3:36 ` Al Viro
2026-02-25 5:29 ` [RFC PATCH] pre-process: add __VA_OPT__ support Eric Zhang
@ 2026-02-25 7:05 ` Chris Li
1 sibling, 0 replies; 42+ messages in thread
From: Chris Li @ 2026-02-25 7:05 UTC (permalink / raw)
To: Al Viro
Cc: Dan Carpenter, linux-sparse, Linus Torvalds, Ricardo Ribalda,
Hans Verkuil, Ben Dooks, Richard Fitzgerald, Xuanhao Zhang
On Tue, Feb 24, 2026 at 7:34 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Tue, Feb 24, 2026 at 06:39:57PM -0800, Chris Li wrote:
>
> > > I guess if you wanted instead of a comma then you could have an empty
> > > __VA_OPT__() or you could pass random things like __VA_OPT__(a b c). I
> > > don't know why you would do that. In this case, "a" is a macro argument
> > > so that has to be expanded out.
>
> <sarcasm>
> Then perhaps reading the standard might prove enlightening - possibly
> due to examples that might be in there, or seeing the actual description
> of semantics? https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf
> is there and searching for __VA_OPT__ would immediately get you this:
>
> #define SDEF(sname, ...) S sname __VA_OPT__(= { __VA_ARGS__ })
> SDEF(foo); // replaced by S foo;
> SDEF(bar, 1, 2); // replaced by S bar = { 1, 2 };
>
> Would that answer your question?
> </sarcasm>
>
> > One idea is to add an expand function expand___VA_OPT__, similar to
> > expand_has_feature(). Register it in the dynamic expand macro array.
> > That way you get the collect_arguments() for free and it behaves just
> > like a builtin macro expansion. Just duplicate the collected arg list
> > to the current token, if there are extra arguments.
>
> Sorry, no go; for one thing, #__VA_OPT__() won't be dealt with that way,
Ah, thanks for the great insight. I did not read the spec and trusted
the gcc behavior instead. I just learned from you that gcc is buggy
in this regard.
Never mind my bad idea.
> for another there's fun with foo ## __VA_OPT__(arg) (argument is expanded
> *and* subjected to ## - yes, it's possible now).
That is very tricky. I just took a look at the "6.10.5.1" regarding
__VA_OPT__(). I don't have a good solution yet.
> I have something resembling a workable approach, but there's nasty corner
> case when you mix it with the side effects; that's impossible in standard
> C, but gcc has the sodding __COUNTER__ thing and _that_ makes life really
> interesting.
>
> If there's __VA_OPT__ in the body, we want to expand vararg whether it
> occurs in the body or not.
>
> We want to expand each argument that is present in the body.
>
> Question: should we expand an argument that occurs *only* under __VA_OPT__?
> Note that "expand and discard" is *not* a no-op - expansion of __COUNTER__
> will have visible side effects. What's more, gcc and clang diverge there.
>
> Another kind of side effect is possible in standard C: argument substitution
> might fail when attempted. And gcc is arguably broken there - see
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123325 for fun details.
> clang handles that one sanely...
>
> I got stalled waiting for gcc folks to respond, then sidetracked to other
> stuff. I'll resurrect that stuff later this week.
Looking forward to your solutions.
Chris
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC PATCH] pre-process: add __VA_OPT__ support
2026-02-25 6:40 ` Al Viro
@ 2026-02-25 7:27 ` Al Viro
2026-02-25 8:14 ` Eric Zhang
0 siblings, 1 reply; 42+ messages in thread
From: Al Viro @ 2026-02-25 7:27 UTC (permalink / raw)
To: Eric Zhang; +Cc: linux-sparse, dan.carpenter, chriscli, ben.dooks, rf, torvalds
On Wed, Feb 25, 2026 at 06:40:03AM +0000, Al Viro wrote:
> On Tue, Feb 24, 2026 at 09:29:50PM -0800, Eric Zhang wrote:
> > Add __VA_OPT__ support (C23 6.10.5) including some tests.
> >
> > At expansion time, substitute() checks whether the variadic
> > argument is empty: if so, skip the region between the markers;
> > otherwise, process the enclosed tokens normally.
> >
> > Signed-off-by: Eric Zhang <zxh@xh-zhang.com>
> > ---
> > Discussed this with Chris during lunch yesterday and got curious
> > about the problem, so made a few attempts. Introducing
> > TOKEN_VA_OPT_START/END feels a bit like an anti-pattern, but I
> > couldn't find a cleaner way to handle it without new token types.
> > Happy to hear suggestions.
>
> Your variant will break with __VA_OPT__ following #. It won't do the
> right thing with ## either, AFAICS, in case when __VA_OPT__ token list
> is empty.
Another problem is that having no __VA_ARGS__ in the body should *not*
be treated as "vararg is empty" - if there's a __VA_OPT__ in the body,
you must expand the vararg, no matter what.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC PATCH] pre-process: add __VA_OPT__ support
2026-02-25 7:27 ` Al Viro
@ 2026-02-25 8:14 ` Eric Zhang
2026-02-25 22:18 ` Al Viro
0 siblings, 1 reply; 42+ messages in thread
From: Eric Zhang @ 2026-02-25 8:14 UTC (permalink / raw)
To: linux-sparse; +Cc: viro, dan.carpenter, chriscli, ben.dooks, rf, torvalds
On Wed, Feb 25, 2026 at 06:40:03AM +0000, Al Viro wrote:
> Your variant will break with __VA_OPT__ following #. It won't do the
> right thing with ## either, AFAICS, in case when __VA_OPT__ token list
> is empty.
Thanks for the review, Al! Hmm, I'm surprised that cases like
foo ## __VA_OPT__(arg) are supported now; I just noticed you had pointed
that out in the previous reply.
Maybe splitting parse_expansion() into two passes could fix it, something like
1. Convert __VA_OPT__(...) → TOKEN_VA_OPT_START...TOKEN_VA_OPT_END
2. Run the existing #/## handling, with handle_hash() and
handle_hashhash() taught to recognize the new markers
For # __VA_OPT__(), a stringify flag on TOKEN_VA_OPT_START could
signal substitute() to stringify or produce "" depending on whether
varargs are empty.
For ## adjacent to __VA_OPT__, the existing Concat/Placeholder state
machine in substitute() should handle the empty case naturally once
the ## is properly converted to TOKEN_CONCAT in pass 2.
> Another problem is that having no __VA_ARGS__ in the body should
> *not* be treated as "vararg is empty"
Missed in the test case, but it seems like it can work in the current
version.
> I got stalled waiting for gcc folks to respond, then sidetracked to
> other stuff. I'll resurrect that stuff later this week.
Then I will look forward to your approach :)
Eric
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC PATCH] pre-process: add __VA_OPT__ support
2026-02-25 8:14 ` Eric Zhang
@ 2026-02-25 22:18 ` Al Viro
2026-02-26 7:29 ` Al Viro
0 siblings, 1 reply; 42+ messages in thread
From: Al Viro @ 2026-02-25 22:18 UTC (permalink / raw)
To: Eric Zhang; +Cc: linux-sparse, dan.carpenter, chriscli, ben.dooks, rf, torvalds
On Wed, Feb 25, 2026 at 12:14:12AM -0800, Eric Zhang wrote:
> For # __VA_OPT__(), a stringify flag on TOKEN_VA_OPT_START could
> signal substitute() to stringify or produce "" depending on whether
> varargs are empty.
IMO it's better to turn that into
TOKEN_VA_OPT[<token-list>]
and
TOKEN_QUOTED_VA_OPT[<token-list>]
with list hanging off the cannibalized token. Interpreter (substitute())
can easily keep track of where it is.
FWIW, the way they patched __VA_OPT__ into 6.10.5.1 is unfortunate -
I can understand wanting to keep the changes localized, but it ends
up very convoluted ;-/ In part it's due to the way ## and # evaluation
order is left unspecified, but... ouch.
Basically, #__VA_OPT__(<token-list>) is treated the following way:
it's "" if va-opt is suppressed, otherwise we
* do argument substitution in <token-list>
* [unspecified, but everyone does that] process # and ##
in the token-list.
* do *NOT* remove placemaker tokens
* do *NOT* rescan
* stringify the resulting token list, same way we would if
that token list had been passed as an argument and we were processing
#<that argument> (as per 6.10.5.2[3]).
## vs. __VA_OPT__ is similar; the tricky part is placemaker treatment.
For normal arguments it's either a non-empty list or a solitary placemaker;
here we might have placemakers with non-empty list.
#define F1(X, Y, ...) Y ## __VA_OPT__(X X) ## Y
#define F2(X, Y, ...) Y ## __VA_OPT__(X) ## Y
F1(,a,_)
F2(,a,_)
We get a ## placemaker placemaker ## a (i.e. a a) or a ## placemaker
## a (i.e. aa) respectively. Approach without explicit ## tokens
at expansion time is easy to adapt to that - we just interpret the
translated token-list hanging off __VA_OPT__, then return to the rest
of the body; state is updated as usual. Quoted __VA_OPT__ == run the
interpreter (starting from Normal) on the token-list, then feed that
to stringify() and use the result as if it came from quoted argument
(note that concatenation with previous token *is* possible -
L ## #__VA_OPT__(something) is not invalid).
Since __VA_OPT__ can't nest, it's easy to save the body->next into a
local variable, set body to body->va_opt_list and, after the main loop
check if that local variable is non-NULL. In that case we just set body
to that, clear that variable and go back to the beginning of the loop.
Quoted ones get a recursive call...
NOTE: substitute() is the second hottest loop in the entire thing; only
tokenizer is hotter. And gcc is too enthusiastic about the inlining
around that function, ending up with bad register spills, along with
a bunch of stalls. Worse, decisions are sensitive to minor changes in
places textually far away, making it a real bitch to deal with.
Makes for fun reordering the commits in local queue... ;-/
> > Another problem is that having no __VA_ARGS__ in the body should
> > *not* be treated as "vararg is empty"
>
> Missed in the test case, but it seems like it can work in the current
> version.
AFAICS you won't even try to expand it at expand_arguments(), so the
reference will remain NULL.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC PATCH] pre-process: add __VA_OPT__ support
2026-02-25 22:18 ` Al Viro
@ 2026-02-26 7:29 ` Al Viro
2026-03-16 6:56 ` Al Viro
0 siblings, 1 reply; 42+ messages in thread
From: Al Viro @ 2026-02-26 7:29 UTC (permalink / raw)
To: Eric Zhang; +Cc: linux-sparse, dan.carpenter, chriscli, ben.dooks, rf, torvalds
On Wed, Feb 25, 2026 at 10:18:51PM +0000, Al Viro wrote:
> NOTE: substitute() is the second hottest loop in the entire thing; only
> tokenizer is hotter. And gcc is too enthusiastic about the inlining
> around that function, ending up with bad register spills, along with
> a bunch of stalls. Worse, decisions are sensitive to minor changes in
> places textually far away, making it a real bitch to deal with.
> Makes for fun reordering the commits in local queue... ;-/
FWIW, looking at that thing again, I wonder if we would be better off
with doing argument expansion on demand rather than doing it in
expand_arguments(). Should be doable with a bit of care - we'd need
to mark the TOKEN_..._ARG with several bits to decide whether we
want to duplicate or not, etc., but that's worth doing anyway -
better than playing with the counters.
Note, BTW, that collapsing TOKEN_..._ARG together, with "kind of argument"
moved into bits stolen from ->argnum improves code generation - that
switch by token type is _hot_ and reducing the number of cases
gives a measurable speedup. Sure, we don't want heavy work at #define
time - most of the macros are never expanded at all, but AFAICS this
kind of processing can be dealt with while parsing the body, with no
extra passes needed, etc.
I'm going down right now, will look into that tomorrow morning...
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [RFC PATCH] pre-process: add __VA_OPT__ support
2026-02-26 7:29 ` Al Viro
@ 2026-03-16 6:56 ` Al Viro
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (3 more replies)
0 siblings, 4 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 6:56 UTC (permalink / raw)
To: Eric Zhang; +Cc: linux-sparse, dan.carpenter, chriscli, ben.dooks, rf, torvalds
On Thu, Feb 26, 2026 at 07:29:45AM +0000, Al Viro wrote:
> On Wed, Feb 25, 2026 at 10:18:51PM +0000, Al Viro wrote:
>
> > NOTE: substitute() is the second hottest loop in the entire thing; only
> > tokenizer is hotter. And gcc is too enthusiastic about the inlining
> > around that function, ending up with bad register spills, along with
> > a bunch of stalls. Worse, decisions are sensitive to minor changes in
> > places textually far away, making it a real bitch to deal with.
> > Makes for fun reordering the commits in local queue... ;-/
>
> FWIW, looking at that thing again, I wonder if we would be better off
> with doing argument expansion on demand rather than doing it in
> expand_arguments(). Should be doable with a bit of care - we'd need
> to mark the TOKEN_..._ARG with several bits to decide whether we
> want to duplicate or not, etc., but that's worth doing anyway -
> better than playing with the counters.
>
> Note, BTW, that collapsing TOKEN_..._ARG together, with "kind of argument"
> moved into bits stolen from ->argnum improves code generation - that
> switch by token type is _hot_ and reducing the number of cases
> gives a measurable speedup. Sure, we don't want heavy work at #define
> time - most of the macros are never expanded at all, but AFAICS this
> kind of processing can be dealt with while parsing the body, with no
> extra passes needed, etc.
>
> I'm going down right now, will look into that tomorrow morning...
That turned out to be trickier than I hoped, but I've got something that
works.
See git://git.kernel.org/pub/scm/linux/kernel/git/viro/sparse.git #va_opt
(or individual patches in followups)
__VA_OPT__ supported, AFAICS behaviour matches C23.
* expansion and stringifying of arguments is full-lazy now -
done on demand and at most once.
* va-opt-replacement parsed at #define time, handled correctly
by dump_macro() (i.e. -dM), comparisons when redefining and at expansion
time.
* arglist mangling is gone, so's the argcount kludge.
* it's no slower than it used to be prior to that series.
I have local followups (tentative fixes for whitespace handling in preprocessor
and optimizations in tokenizer), but let's deal with that one first.
Shortlog:
Al Viro (21):
split copy() into "need to copy" and "can move in place" cases
expand and simplify the call of dup_token() in copy()
more dup_token() optimizations
parsing #define: saner handling of argument count, part 1
simplify collect_arguments() and fix error handling there
try_arg(): don't use arglist for argument name lookups
make expand_has_...() responsible for expanding its argument
preparing to change argument number encoding for TOKEN_..._ARGUMENT
steal 2 bits from argnum for argument kind
on-demand argument expansion
kill create_arglist()
stop mangling arglist, get rid of TOKEN_ARG_COUNT
deal with ## on arguments separately
preparations for __VA_OPT__ support: reshuffle argument slot assignments
pre-process.c: split try_arg()
__VA_OPT__: parsing
expansion-time va_opt handling
merge(): saner handling of ->noexpand
simplify the calling conventions of collect_arguments()
make expand_one_symbol() inline
substitute(): convert switch() into cascade of ifs
Diffstat:
ident-list.h | 1 +
pre-process.c | 929 +++++++++++++++++-----------
symbol.h | 1 +
token.h | 32 +-
tokenize.c | 4 -
validation/preprocessor/bad-args.c | 18 +
validation/preprocessor/dump-macro.c | 13 +
validation/preprocessor/has-attribute.c | 3 +
validation/preprocessor/has-builtin.c | 3 +
validation/preprocessor/va_opt.c | 54 ++
validation/preprocessor/va_opt2.c | 34 +
validation/preprocessor/va_opt_compare.c | 28 +
validation/preprocessor/va_opt_parse.c | 37 ++
validation/preprocessor/va_opt_whitespace.c | 14 +
14 files changed, 797 insertions(+), 374 deletions(-)
create mode 100644 validation/preprocessor/bad-args.c
create mode 100644 validation/preprocessor/dump-macro.c
create mode 100644 validation/preprocessor/va_opt.c
create mode 100644 validation/preprocessor/va_opt2.c
create mode 100644 validation/preprocessor/va_opt_compare.c
create mode 100644 validation/preprocessor/va_opt_parse.c
create mode 100644 validation/preprocessor/va_opt_whitespace.c
PS: as for the interesting uses of __VA_OPT__, consider this:
; cat >test.c <<'EOF'
// based on a fun trick from David Mazières
// see https://www.scs.stanford.edu/~dm/blog/va-opt.html for the entire story
// No, it's not unbounded recursion - up to 256 (4^4) elements in __VA_ARGS__;
// more with trivial modifications, just add more levels to EXPAND...
#define PARENS ()
#define EXPAND(...) EXPAND4(EXPAND4(EXPAND4(EXPAND4(__VA_ARGS__))))
#define EXPAND4(...) EXPAND3(EXPAND3(EXPAND3(EXPAND3(__VA_ARGS__))))
#define EXPAND3(...) EXPAND2(EXPAND2(EXPAND2(EXPAND2(__VA_ARGS__))))
#define EXPAND2(...) EXPAND1(EXPAND1(EXPAND1(EXPAND1(__VA_ARGS__))))
#define EXPAND1(...) __VA_ARGS__
#define FOR_EACH_PAIR(macro, ...) \
__VA_OPT__(EXPAND(FOR_EACH_PAIR_HELPER(macro, __VA_ARGS__)))
#define FOR_EACH_PAIR_HELPER(macro, a1, a2, ...) \
macro(a1, a2) \
__VA_OPT__(FOR_EACH_PAIR_AGAIN PARENS (macro, __VA_ARGS__))
#define FOR_EACH_PAIR_AGAIN() FOR_EACH_PAIR_HELPER
FOR_EACH_PAIR(F, t1, id1, t2, id2, t3, id3, t4, id4, t5, id5, t6, id6)
EOF
; cpp -E test.c
# 0 "test.c"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 0 "<command-line>" 2
# 1 "test.c"
# 18 "test.c"
F(t1, id1) F(t2, id2) F(t3, id3) F(t4, id4) F(t5, id5) F(t6, id6)
;
and the same output from sparse, modulo the # ... lines - sparse -E doesn't
produce those. Our (fairly brittle) analogue is __MAP in linux/syscalls.h
and if nothing else, unlike __MAP() this thing does not need the number
of pairs passed as explicit argument. Would be interesting to try unifying
SYSCALL0..SYSCALL6 into a single macro that would bloody well _count_ the
arguments...
^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases
2026-03-16 6:56 ` Al Viro
@ 2026-03-16 7:03 ` Al Viro
2026-03-16 7:03 ` [PATCH 02/21] expand and simplify the call of dup_token() in copy() Al Viro
` (19 more replies)
2026-03-16 16:42 ` [RFC PATCH] pre-process: add __VA_OPT__ support Linus Torvalds
` (2 subsequent siblings)
3 siblings, 20 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:03 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
For one thing, rechecking the flag on each iteration of a loop is rather
silly, especially when there's very little in common between the "copy"
and "move" cases and the loop is pretty hot.
For another, it's better to have the counter-related logic lifted
into substitute().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index 4e322855..ae493dc2 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -598,15 +598,23 @@ static struct token *dup_token(const struct token *token, struct position *strea
return alloc;
}
-static struct token **copy(struct token **where, struct token *list, int *count)
+static struct token **move_into(struct token **where, struct token *list)
+{
+ *where = list;
+ while (!eof_token(list)) {
+ if (token_type(list) == TOKEN_IDENT && list->ident->tainted)
+ list->pos.noexpand = 1;
+ where = &list->next;
+ list = *where;
+ }
+ return where;
+}
+
+static struct token **copy(struct token **where, struct token *list)
{
- int need_copy = --*count;
while (!eof_token(list)) {
struct token *token;
- if (need_copy)
- token = dup_token(list, &list->pos);
- else
- token = list;
+ token = dup_token(list, &list->pos);
if (token_type(token) == TOKEN_IDENT && token->ident->tainted)
token->pos.noexpand = 1;
*where = token;
@@ -698,7 +706,10 @@ static struct token **substitute(struct token **list, const struct token *body,
continue;
}
copy_arg:
- tail = copy(&added, arg, count);
+ if (!--*count)
+ tail = move_into(&added, arg);
+ else
+ tail = copy(&added, arg);
added->pos.newline = body->pos.newline;
added->pos.whitespace = body->pos.whitespace;
break;
--
2.47.3
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH 02/21] expand and simplify the call of dup_token() in copy()
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
@ 2026-03-16 7:03 ` Al Viro
2026-03-16 7:03 ` [PATCH 03/21] more dup_token() optimizations Al Viro
` (18 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:03 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
reducing ->pos handling helps there
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index ae493dc2..e43061b2 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -613,10 +613,15 @@ static struct token **move_into(struct token **where, struct token *list)
static struct token **copy(struct token **where, struct token *list)
{
while (!eof_token(list)) {
- struct token *token;
- token = dup_token(list, &list->pos);
- if (token_type(token) == TOKEN_IDENT && token->ident->tainted)
- token->pos.noexpand = 1;
+ struct position pos = list->pos;
+ struct token *token = __alloc_token(0);
+
+ token->ident = list->ident;
+ if (pos.type == TOKEN_STRING || pos.type == TOKEN_WIDE_STRING)
+ list->string->immutable = 1;
+ if (pos.type == TOKEN_IDENT && list->ident->tainted)
+ pos.noexpand = 1;
+ token->pos = pos;
*where = token;
where = &token->next;
list = list->next;
--
2.47.3
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH 03/21] more dup_token() optimizations
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
2026-03-16 7:03 ` [PATCH 02/21] expand and simplify the call of dup_token() in copy() Al Viro
@ 2026-03-16 7:03 ` Al Viro
2026-03-16 7:03 ` [PATCH 04/21] parsing #define: saner handling of argument count, part 1 Al Viro
` (17 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:03 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
pull handling of tainted identifiers into dup_token(), eliminating reread
and resulting stall in there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index e43061b2..320a5247 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -583,18 +583,21 @@ static int merge(struct token *left, struct token *right)
return 0;
}
-static struct token *dup_token(const struct token *token, struct position *streampos)
+static inline struct token *dup_token(const struct token *token, struct position *streampos)
{
struct position pos = *streampos;
struct token *alloc = __alloc_token(0);
+ struct position pos2 = token->pos;
- alloc->pos = token->pos;
- alloc->number = token->number;
- alloc->pos.stream = pos.stream;
- alloc->pos.line = pos.line;
- alloc->pos.pos = pos.pos;
- if (token_type(alloc) == TOKEN_STRING || token_type(alloc) == TOKEN_WIDE_STRING)
+ alloc->ident = token->ident;
+ pos2.stream = pos.stream;
+ pos2.line = pos.line;
+ pos2.pos = pos.pos;
+ if (pos2.type == TOKEN_STRING || pos2.type == TOKEN_WIDE_STRING)
token->string->immutable = 1;
+ if (pos2.type == TOKEN_IDENT && token->ident->tainted)
+ pos2.noexpand = 1;
+ alloc->pos = pos2;
return alloc;
}
@@ -728,9 +731,6 @@ static struct token **substitute(struct token **list, const struct token *body,
default:
added = dup_token(body, base_pos);
- if (token_type(body) == TOKEN_IDENT &&
- added->ident->tainted)
- added->pos.noexpand = 1;
tail = &added->next;
break;
}
--
2.47.3
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH 04/21] parsing #define: saner handling of argument count, part 1
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
2026-03-16 7:03 ` [PATCH 02/21] expand and simplify the call of dup_token() in copy() Al Viro
2026-03-16 7:03 ` [PATCH 03/21] more dup_token() optimizations Al Viro
@ 2026-03-16 7:03 ` Al Viro
2026-03-16 7:03 ` [PATCH 05/21] simplify collect_arguments() and fix error handling there Al Viro
` (16 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:03 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
Mangling arglist the way we do is a bad kludge and it limits what we
can do both at parsing a macro definition and at expansion time.
We use it to
* store the number of arguments (gets stashed in the cannibalized
token of opening parenthesis)
* store the number of times each argument is used expanded,
unexpanded and stringified (stashed in cannibalized token of comma or
closing parenthesis that follows an argument)
* mark the vararg argument (ditto)
Total number of arguments would be better off in struct symbol, next
to arglist. As a matter of fact, we'd be better off with number of
non-vararg arguments and "is there a vararg" stored separately.
Number of times each argument occurs expanded, etc. is used to find if
given occurrence of argument in the body is the last one of given sort
- by counting down as we process the body during expansion, no less.
Each counter runs down at the same token of the body every time we expand
the macro, and we can just as easily mark those tokens when we parse the
definition. It is also used to tell whether we need to expand and/or
stringify the argument in the first place. Again, easily expressed
as marking the tokens and we can easily steal bits for TOKEN_..._ARG
payload - we have a 32bit value that represents the argument's number.
"Is it a vararg argument" flag is used both at definition parsing time
(when we would be better off with "the index of vararg argument or -1
if there's none") and at expansion time, when we collect the arguments.
There we pass those values to collect_argument(), telling it whether it
should stop on (unprotected) commas. The current logic is seriously
convoluted, especially around the error recovery. Untangling that ends
up with a variant that wants to know the number of non-vararg arguments
along with "do we have a vararg at all" flag, upfront and not scattered
through the arglist.
As the first step, introduce sym->fixed_args and sym->vararg and have
them calculated when we parse a macro definition; stop storing the number
of arguments in the first token of arglist.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 64 +++++++++++++++++++++++++++++++++++----------------
symbol.h | 1 +
2 files changed, 45 insertions(+), 20 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index 320a5247..d591a183 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -312,9 +312,10 @@ struct arg {
int n_str;
};
-static int collect_arguments(struct token *start, struct token *arglist, struct arg *args, struct token *what)
+static int collect_arguments(struct token *start, struct symbol *sym, struct arg *args, struct token *what)
{
- int wanted = arglist->count.normal;
+ struct token *arglist = sym->arglist;
+ int wanted = sym->fixed_args + sym->vararg;
struct token *next = NULL;
int count = 0;
@@ -760,7 +761,7 @@ static int expand(struct token **list, struct symbol *sym)
struct ident *expanding = token->ident;
struct token **tail;
struct token *expansion = sym->expansion;
- int nargs = sym->arglist ? sym->arglist->count.normal : 0;
+ int nargs = sym->fixed_args + sym->vararg;
struct arg args[nargs];
if (expanding->tainted) {
@@ -771,7 +772,7 @@ static int expand(struct token **list, struct symbol *sym)
if (sym->arglist) {
if (!match_op(scan_next(&token->next), '('))
return 1;
- if (!collect_arguments(token->next, sym->arglist, args, token))
+ if (!collect_arguments(token->next, sym, args, token))
return 1;
expand_arguments(nargs, args);
}
@@ -1087,6 +1088,21 @@ static int token_list_different(struct token *list1, struct token *list2)
}
}
+static int macro_nargs = 0;
+static int macro_vararg = -1;
+static bool macro_funclike = false;
+
+static bool macro_add_arg(struct position pos, struct ident *ident)
+{
+ if (macro_nargs == 1024)
+ goto Eargs;
+ macro_nargs++;
+ return true;
+Eargs:
+ sparse_error(pos, "too many arguments in macro definition");
+ return false;
+}
+
static inline void set_arg_count(struct token *token)
{
token_type(token) = TOKEN_ARG_COUNT;
@@ -1097,7 +1113,6 @@ static inline void set_arg_count(struct token *token)
static struct token *parse_arguments(struct token *list)
{
struct token *arg = list->next, *next = list;
- struct argcount *count = &list->count;
set_arg_count(list);
@@ -1110,10 +1125,10 @@ static struct token *parse_arguments(struct token *list)
while (token_type(arg) == TOKEN_IDENT) {
if (arg->ident == &__VA_ARGS___ident)
goto Eva_args;
- if (!++count->normal)
- goto Eargs;
- next = arg->next;
+ if (!macro_add_arg(arg->pos, arg->ident))
+ return NULL;
+ next = arg->next;
if (match_op(next, ',')) {
set_arg_count(next);
arg = next->next;
@@ -1132,6 +1147,7 @@ static struct token *parse_arguments(struct token *list)
if (match_op(next, SPECIAL_ELLIPSIS)) {
if (match_op(next->next, ')')) {
set_arg_count(next);
+ macro_vararg = macro_nargs - 1;
next->count.vararg = 1;
next = next->next;
arg->next->next = &eof_token_entry;
@@ -1156,9 +1172,10 @@ static struct token *parse_arguments(struct token *list)
arg->ident = &__VA_ARGS___ident;
if (!match_op(next, ')'))
goto Enotclosed;
- if (!++count->normal)
- goto Eargs;
+ if (!macro_add_arg(arg->pos, &__VA_ARGS___ident))
+ return NULL;
set_arg_count(next);
+ macro_vararg = macro_nargs - 1;
next->count.vararg = 1;
next = next->next;
arg->next->next = &eof_token_entry;
@@ -1188,9 +1205,6 @@ Enotclosed:
Eva_args:
sparse_error(arg->pos, "__VA_ARGS__ can only appear in the expansion of a C99 variadic macro");
return NULL;
-Eargs:
- sparse_error(arg->pos, "too many arguments in macro definition");
- return NULL;
}
static int try_arg(struct token *token, enum token_type type, struct token *arglist)
@@ -1198,7 +1212,7 @@ static int try_arg(struct token *token, enum token_type type, struct token *argl
struct ident *ident = token->ident;
int nr;
- if (!arglist || token_type(token) != TOKEN_IDENT)
+ if (!macro_funclike || token_type(token) != TOKEN_IDENT)
return 0;
arglist = arglist->next;
@@ -1221,7 +1235,7 @@ static int try_arg(struct token *token, enum token_type type, struct token *argl
n = ++count->str;
}
if (n)
- return count->vararg ? 2 : 1;
+ return nr == macro_vararg ? 2 : 1;
/*
* XXX - need saner handling of that
* (>= 1024 instances of argument)
@@ -1236,7 +1250,7 @@ static int try_arg(struct token *token, enum token_type type, struct token *argl
static struct token *handle_hash(struct token **p, struct token *arglist)
{
struct token *token = *p;
- if (arglist) {
+ if (macro_funclike) {
struct token *next = token->next;
if (!try_arg(next, TOKEN_STR_ARGUMENT, arglist))
goto Equote;
@@ -1354,7 +1368,7 @@ static int do_define(struct position pos, struct token *token, struct ident *nam
expansion = parse_expansion(expansion, arglist, name);
if (!expansion)
- return 1;
+ goto out;
sym = lookup_symbol(name, NS_MACRO | NS_UNDEF);
if (sym) {
@@ -1388,6 +1402,8 @@ static int do_define(struct position pos, struct token *token, struct ident *nam
if (!ret) {
sym->expansion = expansion;
sym->arglist = arglist;
+ sym->vararg = macro_vararg >= 0;
+ sym->fixed_args = macro_nargs - sym->vararg;
if (token) /* Free the "define" token, but not the rest of the line */
__free_token(token);
}
@@ -1396,6 +1412,9 @@ static int do_define(struct position pos, struct token *token, struct ident *nam
sym->used_in = NULL;
sym->attr = attr;
out:
+ macro_nargs = 0;
+ macro_vararg = -1;
+ macro_funclike = false;
return ret;
}
@@ -1490,8 +1509,12 @@ static int do_handle_define(struct stream *stream, struct token **line, struct t
if (match_op(expansion, '(')) {
arglist = expansion;
expansion = parse_arguments(expansion);
- if (!expansion)
+ if (!expansion) {
+ macro_nargs = 0;
+ macro_vararg = -1;
return 1;
+ }
+ macro_funclike = true;
} else if (!eof_token(expansion)) {
warning(expansion->pos,
"no whitespace before object-like macro body");
@@ -2075,8 +2098,9 @@ static void create_arglist(struct symbol *sym, int count)
token = __alloc_token(0);
token_type(token) = TOKEN_ARG_COUNT;
- token->count.normal = count;
sym->arglist = token;
+ sym->fixed_args = count;
+ sym->vararg = 0;
next = &token->next;
while (count--) {
@@ -2300,7 +2324,7 @@ static int is_VA_ARGS_token(struct token *token)
static void dump_macro(struct symbol *sym)
{
- int nargs = sym->arglist ? sym->arglist->count.normal : 0;
+ int nargs = sym->fixed_args + sym->vararg;
struct token *args[nargs];
struct token *token;
diff --git a/symbol.h b/symbol.h
index 3552d439..026dab6f 100644
--- a/symbol.h
+++ b/symbol.h
@@ -168,6 +168,7 @@ struct symbol {
struct scope *used_in;
void (*expand_simple)(struct token *);
bool (*expand)(struct token *, struct arg *args);
+ int fixed_args, vararg;
};
struct /* NS_PREPROCESSOR */ {
int (*handler)(struct stream *, struct token **, struct token *);
--
2.47.3
* [PATCH 05/21] simplify collect_arguments() and fix error handling there
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (2 preceding siblings ...)
2026-03-16 7:03 ` [PATCH 04/21] parsing #define: saner handling of argument count, part 1 Al Viro
@ 2026-03-16 7:03 ` Al Viro
2026-03-16 7:04 ` [PATCH 06/21] try_arg(): don't use arglist for argument name lookups Al Viro
` (15 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:03 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
The current logic is too convoluted. We use collect_arg() to carve
an argument out; it takes a pointer to the previous token (either the opening
parenthesis or a comma), finds how far the argument extends, cuts the
list at its end and returns the token that follows it (normally either
the closing parenthesis or a comma). collect_arg() is told whether we
want a vararg argument or not - the difference is that normal arguments
terminate on commas.
When a macro has N non-vararg arguments and V (0 or 1) vararg ones, we want
* N calls of collect_arg() asking for non-vararg arguments;
all but the last one must be followed by commas. The last one may be
followed either by a comma or by the closing parenthesis.
* If we have seen exactly N commas, call collect_arg() asking
to collect everything until the closing parenthesis. That will get us
to the end of arguments.
The only potential gotcha is that there is a case when "exactly N commas"
for a non-vararg macro does _not_ mean excess arguments - N = V = 0.
Not hard to account for - in that case we must look at the chunk carved
out by the last (and only) call of collect_arg(); that would be everything
between the parentheses. If it's empty, we are fine; otherwise we have
excess arguments.
Rather than trying to fold all of that into a single loop, separate
the handling of non-vararg arguments from the rest; the logic becomes
simpler that way, especially around the error recovery.
As a side benefit the 'vararg' bit in struct argcount becomes unused
and can be removed.
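The counting rules above can be sketched on a toy model. This is only an
illustration, not the sparse code: it scans a plain string (the text that
follows the opening parenthesis) instead of a token list, and scan_arg(),
is_blank(), check_args() and the ARGS_* codes are names made up for the
sketch.

```c
#include <stdbool.h>

/* Hypothetical result codes for this sketch (not sparse identifiers). */
enum args_status { ARGS_OK, ARGS_FEW, ARGS_EXCESS, ARGS_UNTERMINATED };

/*
 * Toy stand-in for collect_arg(): scan s[] from 'start' and return the
 * index of the terminator.  A non-vararg argument ends at a top-level
 * ',' or ')'; a vararg one ends only at the top-level ')'.
 * Returns -1 if the text runs out first.
 */
static int scan_arg(const char *s, int start, bool vararg)
{
	int depth = 0;

	for (int i = start; s[i]; i++) {
		if (s[i] == '(') {
			depth++;
		} else if (s[i] == ')') {
			if (!depth)
				return i;
			depth--;
		} else if (s[i] == ',' && !depth && !vararg) {
			return i;
		}
	}
	return -1;
}

/* True if s[start..end) holds nothing but blanks. */
static bool is_blank(const char *s, int start, int end)
{
	for (int i = start; i < end; i++)
		if (s[i] != ' ' && s[i] != '\t')
			return false;
	return true;
}

/* 's' is the text following the opening parenthesis of an invocation. */
static enum args_status check_args(const char *s, int fixed, bool vararg)
{
	int start = 0, end, commas;
	bool have_v = false;

	/* N non-vararg arguments; all but the last must end on a comma. */
	for (commas = 0; commas < fixed; commas++) {
		end = scan_arg(s, start, false);
		if (end < 0)
			return ARGS_UNTERMINATED;
		if (s[end] != ',') {
			if (commas < fixed - 1)
				return ARGS_FEW;
			break;
		}
		start = end + 1;
	}
	/* Exactly N commas seen: everything up to ')' is the tail. */
	if (commas == fixed) {
		end = scan_arg(s, start, true);
		if (end < 0)
			return ARGS_UNTERMINATED;
		/* N = V = 0 with nothing between the parentheses is fine. */
		have_v = !(fixed == 0 && is_blank(s, start, end));
	}
	if (have_v && !vararg)
		return ARGS_EXCESS;
	return ARGS_OK;
}
```

For example, check_args("a, b, c)", 2, false) reports excess arguments,
while the same text with fixed = 1 and vararg = true is accepted, with
"b, c" forming the variadic tail.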
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 104 ++++++++++++++++++++++----------------------------
token.h | 1 -
2 files changed, 46 insertions(+), 59 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index d591a183..25990dfa 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -255,7 +255,7 @@ static void expand_list(struct token **list)
static void preprocessor_line(struct stream *stream, struct token **line);
-static struct token *collect_arg(struct token *prev, int vararg, const struct position *pos)
+static struct token *collect_arg(struct token *prev, bool vararg, const struct position *pos)
{
struct stream *stream = input_streams + prev->pos.stream;
struct token **p = &prev->next;
@@ -314,76 +314,66 @@ struct arg {
static int collect_arguments(struct token *start, struct symbol *sym, struct arg *args, struct token *what)
{
+ int fixed = sym->fixed_args;
+ bool vararg = sym->vararg;
struct token *arglist = sym->arglist;
- int wanted = sym->fixed_args + sym->vararg;
- struct token *next = NULL;
- int count = 0;
+ struct argcount *p;
+ struct token *next = NULL, *v = NULL;
+ const char *err;
+ int commas;
arglist = arglist->next; /* skip counter */
- if (!wanted) {
- next = collect_arg(start, 0, &what->pos);
- if (eof_token(next))
+ for (commas = 0; commas < fixed; commas++) {
+ next = collect_arg(start, false, &what->pos);
+ if (token_type(next) != TOKEN_SPECIAL)
goto Eclosing;
- if (!eof_token(start->next) || !match_op(next, ')')) {
- count++;
- goto Emany;
- }
- } else {
- for (count = 0; count < wanted; count++) {
- struct argcount *p = &arglist->next->count;
- next = collect_arg(start, p->vararg, &what->pos);
- if (eof_token(next))
- goto Eclosing;
- if (p->vararg && wanted == 1 && eof_token(start->next))
- break;
- arglist = arglist->next->next;
- args[count].arg = start->next;
- args[count].n_normal = p->normal;
- args[count].n_quoted = p->quoted;
- args[count].n_str = p->str;
- if (match_op(next, ')')) {
- count++;
- break;
- }
- start = next;
- }
- if (count == wanted && !match_op(next, ')'))
- goto Emany;
- if (count == wanted - 1) {
- struct argcount *p = &arglist->next->count;
- if (!p->vararg)
+ p = &arglist->next->count;
+ arglist = arglist->next->next;
+ args[commas].arg = start->next;
+ args[commas].n_normal = p->normal;
+ args[commas].n_quoted = p->quoted;
+ args[commas].n_str = p->str;
+ if (!match_op(next, ',')) {
+ if (commas < fixed - 1)
goto Efew;
- args[count].arg = NULL;
- args[count].n_normal = p->normal;
- args[count].n_quoted = p->quoted;
- args[count].n_str = p->str;
+ break;
}
- if (count < wanted - 1)
- goto Efew;
+ start = next;
+ }
+ if (commas == fixed) {
+ next = collect_arg(start, true, &what->pos);
+ if (token_type(next) != TOKEN_SPECIAL)
+ goto Eclosing;
+ v = start->next;
+ if (fixed == 0 && eof_token(v))
+ v = NULL;
+ }
+ if (v && !vararg)
+ goto Eexcess;
+ if (vararg) {
+ p = &arglist->next->count;
+ args[fixed].arg = v;
+ args[fixed].n_normal = p->normal;
+ args[fixed].n_quoted = p->quoted;
+ args[fixed].n_str = p->str;
}
what->next = next->next;
return 1;
Efew:
- sparse_error(what->pos, "macro \"%s\" requires %d arguments, but only %d given",
- show_token(what), wanted, count);
+ err = "too few arguments provided to";
+ next = next->next;
goto out;
-Emany:
- while (match_op(next, ',')) {
- next = collect_arg(next, 0, &what->pos);
- count++;
- }
- if (eof_token(next))
- goto Eclosing;
- sparse_error(what->pos, "macro \"%s\" passed %d arguments, but takes just %d",
- show_token(what), count, wanted);
+Eexcess:
+ err = "too many arguments provided to";
+ next = next->next;
goto out;
Eclosing:
- sparse_error(what->pos, "unterminated argument list invoking macro \"%s\"",
- show_token(what));
+ err = "unterminated argument list invoking";
out:
- what->next = next->next;
+ sparse_error(what->pos, "%s macro \"%s\"", err, show_ident(sym->ident));
+ what->next = next;
return 0;
}
@@ -1107,7 +1097,7 @@ static inline void set_arg_count(struct token *token)
{
token_type(token) = TOKEN_ARG_COUNT;
token->count.normal = token->count.quoted =
- token->count.str = token->count.vararg = 0;
+ token->count.str = 0;
}
static struct token *parse_arguments(struct token *list)
@@ -1148,7 +1138,6 @@ static struct token *parse_arguments(struct token *list)
if (match_op(next->next, ')')) {
set_arg_count(next);
macro_vararg = macro_nargs - 1;
- next->count.vararg = 1;
next = next->next;
arg->next->next = &eof_token_entry;
return next->next;
@@ -1176,7 +1165,6 @@ static struct token *parse_arguments(struct token *list)
return NULL;
set_arg_count(next);
macro_vararg = macro_nargs - 1;
- next->count.vararg = 1;
next = next->next;
arg->next->next = &eof_token_entry;
return next;
diff --git a/token.h b/token.h
index 9000e0cb..5dcd8594 100644
--- a/token.h
+++ b/token.h
@@ -175,7 +175,6 @@ struct argcount {
unsigned normal:10;
unsigned quoted:10;
unsigned str:10;
- unsigned vararg:1;
};
/*
--
2.47.3
* [PATCH 06/21] try_arg(): don't use arglist for argument name lookups
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (3 preceding siblings ...)
2026-03-16 7:03 ` [PATCH 05/21] simplify collect_arguments() and fix error handling there Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 07/21] make expand_has_...() responsible for expanding its argument Al Viro
` (14 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
Just store the argument names in a global array and search there. That allows us to
get rid of mangling "..." in the arglist along with the is_VA_ARGS_token()
kludge. For now we still need to access the arglist in try_arg(),
but that's going away as soon as we get rid of the use counters...
Added a check for duplicate argument names while we are at it - we
didn't have one before.
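A minimal sketch of the "array of names plus linear search" idea, for
illustration only. The patch itself keeps file-scope globals and compares
interned struct ident pointers; the sketch assumes plain C strings compared
with strcmp(), and struct param_table, find_param() and add_param() are
hypothetical names.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_PARAMS 1024	/* same limit as the patch uses */

/* Hypothetical container; the patch uses file-scope globals instead. */
struct param_table {
	const char *name[MAX_PARAMS];
	int count;
};

/* Index of 'name' in the table, or -1 if it is not a parameter. */
static int find_param(const struct param_table *t, const char *name)
{
	for (int i = 0; i < t->count; i++)
		if (!strcmp(t->name[i], name))
			return i;
	return -1;
}

/* Append a parameter; refuse duplicates and overflow. */
static bool add_param(struct param_table *t, const char *name)
{
	if (find_param(t, name) >= 0)
		return false;	/* duplicate, e.g. #define C(X,Y,X) */
	if (t->count == MAX_PARAMS)
		return false;	/* too many parameters */
	t->name[t->count++] = name;
	return true;
}
```

The index returned by find_param() plays the role of the argument number
that try_arg() stores in the token.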
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 90 ++++++++++++++--------------
validation/preprocessor/bad-args.c | 18 ++++++
validation/preprocessor/dump-macro.c | 9 +++
3 files changed, 72 insertions(+), 45 deletions(-)
create mode 100644 validation/preprocessor/bad-args.c
create mode 100644 validation/preprocessor/dump-macro.c
diff --git a/pre-process.c b/pre-process.c
index 25990dfa..17ed7f85 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -1078,16 +1078,24 @@ static int token_list_different(struct token *list1, struct token *list2)
}
}
+static struct ident *macro_arg_name[1024];
static int macro_nargs = 0;
static int macro_vararg = -1;
static bool macro_funclike = false;
static bool macro_add_arg(struct position pos, struct ident *ident)
{
+ for (int i = 0; i < macro_nargs; i++) {
+ if (ident == macro_arg_name[i])
+ goto Edup_arg;
+ }
if (macro_nargs == 1024)
goto Eargs;
- macro_nargs++;
+ macro_arg_name[macro_nargs++] = ident;
return true;
+Edup_arg:
+ sparse_error(pos, "duplicate macro parameter \"%s\"", show_ident(ident));
+ return false;
Eargs:
sparse_error(pos, "too many arguments in macro definition");
return false;
@@ -1157,8 +1165,6 @@ static struct token *parse_arguments(struct token *list)
if (match_op(arg, SPECIAL_ELLIPSIS)) {
next = arg->next;
- token_type(arg) = TOKEN_IDENT;
- arg->ident = &__VA_ARGS___ident;
if (!match_op(next, ')'))
goto Enotclosed;
if (!macro_add_arg(arg->pos, &__VA_ARGS___ident))
@@ -1198,41 +1204,41 @@ Eva_args:
static int try_arg(struct token *token, enum token_type type, struct token *arglist)
{
struct ident *ident = token->ident;
- int nr;
+ int nr, n;
if (!macro_funclike || token_type(token) != TOKEN_IDENT)
return 0;
- arglist = arglist->next;
+ for (nr = 0; nr < macro_nargs && macro_arg_name[nr] != ident; nr++)
+ ;
- for (nr = 0; !eof_token(arglist); nr++, arglist = arglist->next->next) {
- if (arglist->ident == ident) {
- struct argcount *count = &arglist->next->count;
- int n;
+ if (nr == macro_nargs)
+ return 0;
- token->argnum = nr;
- token_type(token) = type;
- switch (type) {
- case TOKEN_MACRO_ARGUMENT:
- n = ++count->normal;
- break;
- case TOKEN_QUOTED_ARGUMENT:
- n = ++count->quoted;
- break;
- default:
- n = ++count->str;
- }
- if (n)
- return nr == macro_vararg ? 2 : 1;
- /*
- * XXX - need saner handling of that
- * (>= 1024 instances of argument)
- */
- token_type(token) = TOKEN_ERROR;
- return -1;
- }
+ arglist = arglist->next;
+ for (int i = 0; i < nr; i++)
+ arglist = arglist->next->next;
+
+ token->argnum = nr;
+ token_type(token) = type;
+ switch (type) {
+ case TOKEN_MACRO_ARGUMENT:
+ n = ++arglist->next->count.normal;
+ break;
+ case TOKEN_QUOTED_ARGUMENT:
+ n = ++arglist->next->count.quoted;
+ break;
+ default:
+ n = ++arglist->next->count.str;
}
- return 0;
+ if (n)
+ return nr == macro_vararg ? 2 : 1;
+ /*
+ * XXX - need saner handling of that
+ * (>= 1024 instances of argument)
+ */
+ token_type(token) = TOKEN_ERROR;
+ return -1;
}
static struct token *handle_hash(struct token **p, struct token *arglist)
@@ -2304,16 +2310,10 @@ struct token * preprocess(struct token *token)
return token;
}
-static int is_VA_ARGS_token(struct token *token)
-{
- return (token_type(token) == TOKEN_IDENT) &&
- (token->ident == &__VA_ARGS___ident);
-}
-
static void dump_macro(struct symbol *sym)
{
int nargs = sym->fixed_args + sym->vararg;
- struct token *args[nargs];
+ struct ident *args[nargs];
struct token *token;
printf("#define %s", show_ident(sym->ident));
@@ -2325,13 +2325,13 @@ static void dump_macro(struct symbol *sym)
for (; !eof_token(token); token = token->next) {
if (token_type(token) == TOKEN_ARG_COUNT)
continue;
- if (is_VA_ARGS_token(token))
- printf("%s...", sep);
- else
- printf("%s%s", sep, show_token(token));
- args[narg++] = token;
+ printf("%s%s", sep, show_token(token));
+ if (token_type(token) == TOKEN_IDENT)
+ args[narg++] = token->ident;
sep = ",";
}
+ if (narg < nargs)
+ args[narg] = &__VA_ARGS___ident;
putchar(')');
}
@@ -2349,8 +2349,8 @@ static void dump_macro(struct symbol *sym)
/* fall-through */
case TOKEN_QUOTED_ARGUMENT:
case TOKEN_MACRO_ARGUMENT:
- token = args[token->argnum];
- /* fall-through */
+ printf("%s", show_ident(args[token->argnum]));
+ break;
default:
printf("%s", show_token(token));
}
diff --git a/validation/preprocessor/bad-args.c b/validation/preprocessor/bad-args.c
new file mode 100644
index 00000000..3dbb6f92
--- /dev/null
+++ b/validation/preprocessor/bad-args.c
@@ -0,0 +1,18 @@
+#define A(1)
+#define B(__VA_ARGS__)
+#define C(X,Y,X)
+/*
+ * check-name: macro arguments validation
+ * check-command: sparse -E $file
+ *
+ * check-output-start
+
+
+ * check-output-end
+ *
+ * check-error-start
+preprocessor/bad-args.c:1:11: error: "1" may not appear in macro parameter list
+preprocessor/bad-args.c:2:11: error: __VA_ARGS__ can only appear in the expansion of a C99 variadic macro
+preprocessor/bad-args.c:3:15: error: duplicate macro parameter "X"
+ * check-error-end
+ */
diff --git a/validation/preprocessor/dump-macro.c b/validation/preprocessor/dump-macro.c
new file mode 100644
index 00000000..46d70b34
--- /dev/null
+++ b/validation/preprocessor/dump-macro.c
@@ -0,0 +1,9 @@
+#define A(X,Y,...) __VA_ARGS__,Y,X
+/*
+ * check-name: -dM handling of varargs
+ * check-command: sparse -E -dM $file | tail -1
+ *
+ * check-output-start
+#define A(X,Y,...) __VA_ARGS__,Y,X
+ * check-output-end
+ */
--
2.47.3
* [PATCH 07/21] make expand_has_...() responsible for expanding its argument
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (4 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 06/21] try_arg(): don't use arglist for argument name lookups Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 08/21] preparing to change argument number encoding for TOKEN_..._ARGUMENT Al Viro
` (13 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
If we want to make expansion of arguments on-demand, we need to adjust
->expand() calling conventions first, passing it unexpanded arguments.
Switch create_arglist() to setting argcounts from normal=1 to quoted=1,
provide a helper (first_arg()) that does actual expansion and make
->expand() instances use it. After that these fake arglists are used
only for two things: they indicate that these macros are function-like
and they short-circuit expand_arguments(). Once we switch to on-demand
argument expansion, the second role will disappear and we can just use
&eof_token_entry as ->arglist for those.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 17 ++++++++++++-----
validation/preprocessor/has-attribute.c | 3 +++
validation/preprocessor/has-builtin.c | 3 +++
3 files changed, 18 insertions(+), 5 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index 17ed7f85..85662365 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -2000,9 +2000,16 @@ static int handle_nondirective(struct stream *stream, struct token **line, struc
return 1;
}
+static struct token *first_arg(struct arg *args)
+{
+ struct token *arg = args[0].arg;
+ expand_list(&arg);
+ return arg;
+}
+
static bool expand_has_attribute(struct token *token, struct arg *args)
{
- struct token *arg = args[0].expanded;
+ struct token *arg = first_arg(args);
struct symbol *sym;
if (token_type(arg) != TOKEN_IDENT) {
@@ -2017,7 +2024,7 @@ static bool expand_has_attribute(struct token *token, struct arg *args)
static bool expand_has_builtin(struct token *token, struct arg *args)
{
- struct token *arg = args[0].expanded;
+ struct token *arg = first_arg(args);
struct symbol *sym;
if (token_type(arg) != TOKEN_IDENT) {
@@ -2032,7 +2039,7 @@ static bool expand_has_builtin(struct token *token, struct arg *args)
static bool expand_has_extension(struct token *token, struct arg *args)
{
- struct token *arg = args[0].expanded;
+ struct token *arg = first_arg(args);
struct ident *ident;
bool val = false;
@@ -2057,7 +2064,7 @@ static bool expand_has_extension(struct token *token, struct arg *args)
static bool expand_has_feature(struct token *token, struct arg *args)
{
- struct token *arg = args[0].expanded;
+ struct token *arg = first_arg(args);
struct ident *ident;
bool val = false;
@@ -2103,7 +2110,7 @@ static void create_arglist(struct symbol *sym, int count)
token_type(id) = TOKEN_IDENT;
uses = __alloc_token(0);
token_type(uses) = TOKEN_ARG_COUNT;
- uses->count.normal = 1;
+ uses->count.quoted = 1;
*next = id;
id->next = uses;
diff --git a/validation/preprocessor/has-attribute.c b/validation/preprocessor/has-attribute.c
index 3149cbfa..dd0f275e 100644
--- a/validation/preprocessor/has-attribute.c
+++ b/validation/preprocessor/has-attribute.c
@@ -6,6 +6,8 @@ __has_attribute()??? Quesako?
#endif
123 __has_attribute(nothinx) def
+#define A packed
+456 __has_attribute(A)
#if __has_attribute(nothinx)
#error "not a attribute!"
@@ -49,6 +51,7 @@ __has_attribute()??? Quesako?
"has __has_attribute(), yeah!"
123 0 def
+456 1
"ok gcc"
"ok gcc ignore"
"ok sparse specific"
diff --git a/validation/preprocessor/has-builtin.c b/validation/preprocessor/has-builtin.c
index 03272fc9..010d44bd 100644
--- a/validation/preprocessor/has-builtin.c
+++ b/validation/preprocessor/has-builtin.c
@@ -28,6 +28,8 @@ constant_p
#endif
123 __has_builtin(abc) def
+#define A __builtin_constant_p
+456 __has_builtin(A)
/*
* check-name: has-builtin
@@ -39,5 +41,6 @@ constant_p
abs
constant_p
123 0 def
+456 1
* check-output-end
*/
--
2.47.3
* [PATCH 08/21] preparing to change argument number encoding for TOKEN_..._ARGUMENT
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (5 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 07/21] make expand_has_...() responsible for expanding its argument Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 09/21] steal 2 bits from argnum for argument kind Al Viro
` (12 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
We want to steal some bits from TOKEN_..._ARGUMENT; it's not a problem,
seeing that the payload is a 32-bit number and we are *not* going to support
many millions of arguments in macros. For now, wrap the accesses into
an inline helper (argnum(token)), to reduce the amount of noise in
subsequent patches.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 23 ++++++++++++++---------
token.h | 4 ++++
2 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index 85662365..efb208e7 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -624,11 +624,16 @@ static struct token **copy(struct token **where, struct token *list)
return where;
}
+static inline int argnum(const struct token *arg)
+{
+ return arg->argnum >> ARGNUM_BITS_STOLEN;
+}
+
static int handle_kludge(const struct token **p, struct arg *args)
{
const struct token *t = (*p)->next->next;
while (1) {
- struct arg *v = &args[t->argnum];
+ struct arg *v = &args[argnum(t)];
if (token_type(t->next) != TOKEN_CONCAT) {
if (v->arg) {
/* ignore the first ## */
@@ -681,13 +686,13 @@ static struct token **substitute(struct token **list, const struct token *body,
break;
case TOKEN_STR_ARGUMENT:
- arg = args[body->argnum].str;
- count = &args[body->argnum].n_str;
+ arg = args[argnum(body)].str;
+ count = &args[argnum(body)].n_str;
goto copy_arg;
case TOKEN_QUOTED_ARGUMENT:
- arg = args[body->argnum].arg;
- count = &args[body->argnum].n_quoted;
+ arg = args[argnum(body)].arg;
+ count = &args[argnum(body)].n_quoted;
if (!arg || eof_token(arg)) {
if (state == Concat)
state = Normal;
@@ -698,8 +703,8 @@ static struct token **substitute(struct token **list, const struct token *body,
goto copy_arg;
case TOKEN_MACRO_ARGUMENT:
- arg = args[body->argnum].expanded;
- count = &args[body->argnum].n_normal;
+ arg = args[argnum(body)].expanded;
+ count = &args[argnum(body)].n_normal;
if (eof_token(arg)) {
state = Normal;
continue;
@@ -1219,7 +1224,7 @@ static int try_arg(struct token *token, enum token_type type, struct token *argl
for (int i = 0; i < nr; i++)
arglist = arglist->next->next;
- token->argnum = nr;
+ token->argnum = nr << ARGNUM_BITS_STOLEN;
token_type(token) = type;
switch (type) {
case TOKEN_MACRO_ARGUMENT:
@@ -2356,7 +2361,7 @@ static void dump_macro(struct symbol *sym)
/* fall-through */
case TOKEN_QUOTED_ARGUMENT:
case TOKEN_MACRO_ARGUMENT:
- printf("%s", show_ident(args[token->argnum]));
+ printf("%s", show_ident(args[argnum(token)]));
break;
default:
printf("%s", show_token(token));
diff --git a/token.h b/token.h
index 5dcd8594..fe7c7fe9 100644
--- a/token.h
+++ b/token.h
@@ -177,6 +177,10 @@ struct argcount {
unsigned str:10;
};
+enum {
+ ARGNUM_BITS_STOLEN
+};
+
/*
* This is a very common data structure, it should be kept
* as small as humanly possible. Big (rare) types go as
--
2.47.3
* [PATCH 09/21] steal 2 bits from argnum for argument kind
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (6 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 08/21] preparing to change argument number encoding for TOKEN_..._ARGUMENT Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 10/21] on-demand argument expansion Al Viro
` (11 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
We have 3 separate token types (TOKEN_{MACRO,QUOTED,STR}_ARGUMENT),
with fairly similar handling at expansion time. Let's steal two bits
from ->argnum and use them to represent the kind of occurrence; that
simplifies substitute() and allows for better code generation there.
The object we use to store the argument state at expansion time (struct
arg) is already a structure with 3 pointers to token lists (unexpanded,
expanded and stringified forms of the argument) and 3 integer counters -
the number of remaining occurrences of each kind. Gather those into
3-element arrays indexed by the kind; counts will be gone soon, token
lists will remain.
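The encoding can be sketched as follows. This is a rough illustration under
assumptions: the low 2 bits carry the kind and the rest the argument index,
matching the "steal 2 bits" in the subject. Apart from the quoted kind being
0, the numeric values of the kinds are guesses, and every name below
(KIND_*, pack_arg(), arg_index(), arg_kind_of(), struct arg_state,
pick_form()) is hypothetical rather than taken from sparse.

```c
/* Assumed layout: [ index : 30 bits ][ kind : 2 bits ] */
enum { KIND_BITS = 2, KIND_MASK = (1 << KIND_BITS) - 1 };

/* Only "quoted == 0" is taken from the patch; the rest is assumed. */
enum kind { KIND_QUOTED = 0, KIND_NORMAL = 1, KIND_STR = 2 };

/* Combine argument index and kind of occurrence into one number. */
static inline unsigned pack_arg(unsigned nr, enum kind k)
{
	return (nr << KIND_BITS) | (unsigned)k;
}

/* Recover the argument index. */
static inline unsigned arg_index(unsigned packed)
{
	return packed >> KIND_BITS;
}

/* Recover the kind of occurrence. */
static inline enum kind arg_kind_of(unsigned packed)
{
	return (enum kind)(packed & KIND_MASK);
}

/* Per-argument state: one form and one counter per kind. */
struct arg_state {
	const char *form[3];	/* stand-in for the three token lists */
	int count[3];		/* remaining occurrences of each kind */
};

/* One lookup replaces three nearly identical switch cases. */
static inline const char *pick_form(const struct arg_state *args,
				    unsigned packed)
{
	return args[arg_index(packed)].form[arg_kind_of(packed)];
}
```

With the kind folded into the number, a single token type is enough and
the substitution code can index both arrays directly.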
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 106 +++++++++++++++++++++-----------------------------
token.h | 14 +++++--
2 files changed, 55 insertions(+), 65 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index efb208e7..b45688d5 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -304,12 +304,8 @@ static struct token *collect_arg(struct token *prev, bool vararg, const struct p
*/
struct arg {
- struct token *arg;
- struct token *expanded;
- struct token *str;
- int n_normal;
- int n_quoted;
- int n_str;
+ struct token *arg[3];
+ int count[3];
};
static int collect_arguments(struct token *start, struct symbol *sym, struct arg *args, struct token *what)
@@ -330,10 +326,10 @@ static int collect_arguments(struct token *start, struct symbol *sym, struct arg
goto Eclosing;
p = &arglist->next->count;
arglist = arglist->next->next;
- args[commas].arg = start->next;
- args[commas].n_normal = p->normal;
- args[commas].n_quoted = p->quoted;
- args[commas].n_str = p->str;
+ args[commas].arg[ARG_QUOTED] = start->next;
+ args[commas].count[ARG_NORMAL] = p->normal;
+ args[commas].count[ARG_QUOTED] = p->quoted;
+ args[commas].count[ARG_STR] = p->str;
if (!match_op(next, ',')) {
if (commas < fixed - 1)
goto Efew;
@@ -353,10 +349,10 @@ static int collect_arguments(struct token *start, struct symbol *sym, struct arg
goto Eexcess;
if (vararg) {
p = &arglist->next->count;
- args[fixed].arg = v;
- args[fixed].n_normal = p->normal;
- args[fixed].n_quoted = p->quoted;
- args[fixed].n_str = p->str;
+ args[fixed].arg[ARG_QUOTED] = v;
+ args[fixed].count[ARG_NORMAL] = p->normal;
+ args[fixed].count[ARG_QUOTED] = p->quoted;
+ args[fixed].count[ARG_STR] = p->str;
}
what->next = next->next;
return 1;
@@ -440,21 +436,21 @@ static void expand_arguments(int count, struct arg *args)
{
int i;
for (i = 0; i < count; i++) {
- struct token *arg = args[i].arg;
+ struct token *arg = args[i].arg[ARG_QUOTED];
if (!arg)
arg = &eof_token_entry;
- if (args[i].n_str)
- args[i].str = stringify(arg);
- if (args[i].n_normal) {
- if (!args[i].n_quoted) {
- args[i].expanded = arg;
- args[i].arg = NULL;
+ if (args[i].count[ARG_STR])
+ args[i].arg[ARG_STR] = stringify(arg);
+ if (args[i].count[ARG_NORMAL]) {
+ if (!args[i].count[ARG_QUOTED]) {
+ args[i].arg[ARG_NORMAL] = arg;
+ args[i].arg[ARG_QUOTED] = NULL;
} else if (eof_token(arg)) {
- args[i].expanded = arg;
+ args[i].arg[ARG_NORMAL] = arg;
} else {
- args[i].expanded = dup_list(arg);
+ args[i].arg[ARG_NORMAL] = dup_list(arg);
}
- expand_list(&args[i].expanded);
+ expand_list(&args[i].arg[ARG_NORMAL]);
}
}
}
@@ -629,13 +625,18 @@ static inline int argnum(const struct token *arg)
return arg->argnum >> ARGNUM_BITS_STOLEN;
}
+static inline enum arg_kind argkind(const struct token *arg)
+{
+ return arg->argnum & ARGNUM_KIND_MASK;
+}
+
static int handle_kludge(const struct token **p, struct arg *args)
{
const struct token *t = (*p)->next->next;
while (1) {
- struct arg *v = &args[argnum(t)];
+ struct token *v = args[argnum(t)].arg[ARG_QUOTED];
if (token_type(t->next) != TOKEN_CONCAT) {
- if (v->arg) {
+ if (v) {
/* ignore the first ## */
*p = (*p)->next;
return 0;
@@ -644,7 +645,7 @@ static int handle_kludge(const struct token **p, struct arg *args)
*p = t;
return 1;
}
- if (v->arg && !eof_token(v->arg))
+ if (v && !eof_token(v))
return 0; /* no magic */
t = t->next->next;
}
@@ -685,14 +686,9 @@ static struct token **substitute(struct token **list, const struct token *body,
tail = &added->next;
break;
- case TOKEN_STR_ARGUMENT:
- arg = args[argnum(body)].str;
- count = &args[argnum(body)].n_str;
- goto copy_arg;
-
- case TOKEN_QUOTED_ARGUMENT:
- arg = args[argnum(body)].arg;
- count = &args[argnum(body)].n_quoted;
+ case TOKEN_MACRO_ARGUMENT:
+ arg = args[argnum(body)].arg[argkind(body)];
+ count = &args[argnum(body)].count[argkind(body)];
if (!arg || eof_token(arg)) {
if (state == Concat)
state = Normal;
@@ -700,16 +696,6 @@ static struct token **substitute(struct token **list, const struct token *body,
state = Placeholder;
continue;
}
- goto copy_arg;
-
- case TOKEN_MACRO_ARGUMENT:
- arg = args[argnum(body)].expanded;
- count = &args[argnum(body)].n_normal;
- if (eof_token(arg)) {
- state = Normal;
- continue;
- }
- copy_arg:
if (!--*count)
tail = move_into(&added, arg);
else
@@ -1040,8 +1026,6 @@ static int token_different(struct token *t1, struct token *t2)
different = t1->special != t2->special;
break;
case TOKEN_MACRO_ARGUMENT:
- case TOKEN_QUOTED_ARGUMENT:
- case TOKEN_STR_ARGUMENT:
different = t1->argnum != t2->argnum;
break;
case TOKEN_CHAR_EMBEDDED_0 ... TOKEN_CHAR_EMBEDDED_3:
@@ -1206,7 +1190,7 @@ Eva_args:
return NULL;
}
-static int try_arg(struct token *token, enum token_type type, struct token *arglist)
+static int try_arg(struct token *token, enum arg_kind kind, struct token *arglist)
{
struct ident *ident = token->ident;
int nr, n;
@@ -1224,13 +1208,13 @@ static int try_arg(struct token *token, enum token_type type, struct token *argl
for (int i = 0; i < nr; i++)
arglist = arglist->next->next;
- token->argnum = nr << ARGNUM_BITS_STOLEN;
- token_type(token) = type;
- switch (type) {
- case TOKEN_MACRO_ARGUMENT:
+ token->argnum = (nr << ARGNUM_BITS_STOLEN) | kind;
+ token_type(token) = TOKEN_MACRO_ARGUMENT;
+ switch (kind) {
+ case ARG_NORMAL:
n = ++arglist->next->count.normal;
break;
- case TOKEN_QUOTED_ARGUMENT:
+ case ARG_QUOTED:
n = ++arglist->next->count.quoted;
break;
default:
@@ -1251,7 +1235,7 @@ static struct token *handle_hash(struct token **p, struct token *arglist)
struct token *token = *p;
if (macro_funclike) {
struct token *next = token->next;
- if (!try_arg(next, TOKEN_STR_ARGUMENT, arglist))
+ if (!try_arg(next, ARG_STR, arglist))
goto Equote;
next->pos.whitespace = token->pos.whitespace;
__free_token(token);
@@ -1273,7 +1257,7 @@ static struct token *handle_hashhash(struct token *token, struct token *arglist)
struct token *concat;
int state = match_op(token, ',');
- try_arg(token, TOKEN_QUOTED_ARGUMENT, arglist);
+ try_arg(token, ARG_QUOTED, arglist);
while (1) {
struct token *t;
@@ -1297,7 +1281,7 @@ static struct token *handle_hashhash(struct token *token, struct token *arglist)
return NULL;
}
- is_arg = try_arg(t, TOKEN_QUOTED_ARGUMENT, arglist);
+ is_arg = try_arg(t, ARG_QUOTED, arglist);
if (state == 1 && is_arg) {
state = is_arg;
@@ -1339,7 +1323,7 @@ static struct token *parse_expansion(struct token *expansion, struct token *argl
if (!token)
return NULL;
} else {
- try_arg(token, TOKEN_MACRO_ARGUMENT, arglist);
+ try_arg(token, ARG_NORMAL, arglist);
}
if (token_type(token) == TOKEN_ERROR)
goto Earg;
@@ -2007,7 +1991,7 @@ static int handle_nondirective(struct stream *stream, struct token **line, struc
static struct token *first_arg(struct arg *args)
{
- struct token *arg = args[0].arg;
+ struct token *arg = args[0].arg[ARG_QUOTED];
expand_list(&arg);
return arg;
}
@@ -2356,11 +2340,9 @@ static void dump_macro(struct symbol *sym)
case TOKEN_CONCAT:
printf("##");
break;
- case TOKEN_STR_ARGUMENT:
- printf("#");
- /* fall-through */
- case TOKEN_QUOTED_ARGUMENT:
case TOKEN_MACRO_ARGUMENT:
+ if (argkind(token) == ARG_STR)
+ printf("#");
printf("%s", show_ident(args[argnum(token)]));
break;
default:
diff --git a/token.h b/token.h
index fe7c7fe9..273da39a 100644
--- a/token.h
+++ b/token.h
@@ -100,8 +100,6 @@ enum token_type {
TOKEN_STREAMBEGIN,
TOKEN_STREAMEND,
TOKEN_MACRO_ARGUMENT,
- TOKEN_STR_ARGUMENT,
- TOKEN_QUOTED_ARGUMENT,
TOKEN_CONCAT,
TOKEN_GNU_KLUDGE,
TOKEN_UNTAINT,
@@ -177,8 +175,18 @@ struct argcount {
unsigned str:10;
};
+enum arg_kind {
+ ARG_QUOTED = 0,
+ ARG_NORMAL = 1,
+ ARG_STR = 2,
+};
+
+enum {
+ ARGNUM_BITS_STOLEN = 2
+};
+
enum {
- ARGNUM_BITS_STOLEN
+ ARGNUM_KIND_MASK = 3
};
/*
--
2.47.3
* [PATCH 10/21] on-demand argument expansion
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (7 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 09/21] steal 2 bits from argnum for argument kind Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 11/21] kill create_arglist() Al Viro
` (10 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
Instead of calculating expanded and stringified forms of arguments before
we get to interpolating them into the body, do that on demand.
There are several subtle points involved:
* array of arguments ('args' in collect_arguments() and friends)
needs to be explicitly zeroed; then we can easily see whether we'd
already expanded or stringified the sucker. Better done as an explicit
memset(), and it's better off in collect_arguments() than in expand()
itself - that way we don't need to bother with that for macros that
don't have arguments in the first place. And yes, doing that as an
empty initializer in expand() where that VLA is declared does result in
a measurable slowdown...
* anti-recursion rules do not apply to argument expansion;
something like
#define A(x) x + 1
A(A(1))
should result in 1 + 1 + 1, not A(1) + 1. If expansion is done before
we mark 'A' and call substitute(), that happens automatically; if we
delay it until substitute() runs into the first normal instance of an
argument, we need to remove the mark for the duration of that expansion.
Not a problem, fortunately, since the set of marked symbols at the time
we get to TOKEN_MACRO_ARGUMENT will be the same as it used to be at the
time we enter substitute() - arguments can't contain any TOKEN_UNTAINT
(collect_arg() would eat all of those) and any taint added during
expand_list() will come with matching TOKEN_UNTAINT inserted into the
list. All such TOKEN_UNTAINT will be consumed before expand_list()
returns, restoring the original conditions. All we need to do is to
remove the taint from the macro being substituted just before expanding an
argument and restore it right after that - it will do the right thing.
It makes sense to shift setting the taint from the caller of substitute()
where it's currently done into the very beginning of substitute() itself,
while we are at it.
* instead of using counters to determine whether a given form of an
argument is still needed past this point, just mark the last place where
that form is needed while we are parsing the body - it's easy enough to do.
The only subtlety here is that the unexpanded argument is needed to calculate
the expanded form, so in
#define A(x) foo_##x = x
we can't cannibalize the unexpanded form of x until the second instance of
x in the body. The rules for the unexpanded form are:
1) it's needed for any unexpanded occurrence (obviously)
2) it's needed for the first expanded occurrence
3) it's needed for the first stringified occurrence
Fortunately, we don't need anything non-trivial at #define time - no
separate passes, etc. Note that in the real world most of the macros seen
in a given compilation unit are never expanded in it, so we need to keep the
handling of #define light - trading the overhead at expansion time for
overhead at definition time is a bad idea.
* since the counters are gone, we no longer need to pass arglist
all over the place; the parsing side gets an array of struct arg_state,
which is where we keep the information about argument occurrences
during parsing. The expansion side doesn't need arglist anymore -
the last place that used to need it was collect_arguments(), and only
to copy the counters. No more of that...
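The results expected from the two example macros above can be checked with
any conforming C preprocessor. What follows is only a minimal sketch: STR(),
STR_(), the name B for the second macro, the object-like macro 'one' and all
function names are made up for illustration and are not part of sparse, and
the exact spacing of the stringified text is assumed to be what gcc and clang
produce.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification: STR() expands its argument, STR_() quotes it. */
#define STR_(...) #__VA_ARGS__
#define STR(...) STR_(__VA_ARGS__)

#define A(x) x + 1
#define B(x) foo_##x = x
#define one 1

/* Text of A(A(1)): the inner A(1) is expanded although we are inside A. */
const char *nested_text(void)
{
	return STR(A(A(1)));
}

/* The same expansion evaluated as an expression: 1 + 1 + 1. */
int nested_value(void)
{
	return A(A(1));
}

/* The x next to ## goes in unexpanded, the second x is expanded. */
const char *mixed_text(void)
{
	return STR(B(one));
}

/* Hypothetical self-check tying the three cases together. */
void check_expansions(void)
{
	assert(strcmp(nested_text(), "1 + 1 + 1") == 0);
	assert(nested_value() == 3);
	assert(strcmp(mixed_text(), "foo_one = 1") == 0);
}
```

Calling check_expansions() from a test program exercises both points: the
argument of A is expanded without the taint on A getting in the way, and in
B the raw form of x has to survive until the expanded one has been computed.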
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 159 ++++++++++++++++++++++++--------------------------
token.h | 8 ++-
2 files changed, 82 insertions(+), 85 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index b45688d5..bd049620 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -305,31 +305,23 @@ static struct token *collect_arg(struct token *prev, bool vararg, const struct p
struct arg {
struct token *arg[3];
- int count[3];
};
static int collect_arguments(struct token *start, struct symbol *sym, struct arg *args, struct token *what)
{
int fixed = sym->fixed_args;
bool vararg = sym->vararg;
- struct token *arglist = sym->arglist;
- struct argcount *p;
struct token *next = NULL, *v = NULL;
const char *err;
int commas;
- arglist = arglist->next; /* skip counter */
+ memset(args, 0, sizeof(struct arg) * (fixed + 1));
for (commas = 0; commas < fixed; commas++) {
next = collect_arg(start, false, &what->pos);
if (token_type(next) != TOKEN_SPECIAL)
goto Eclosing;
- p = &arglist->next->count;
- arglist = arglist->next->next;
args[commas].arg[ARG_QUOTED] = start->next;
- args[commas].count[ARG_NORMAL] = p->normal;
- args[commas].count[ARG_QUOTED] = p->quoted;
- args[commas].count[ARG_STR] = p->str;
if (!match_op(next, ',')) {
if (commas < fixed - 1)
goto Efew;
@@ -347,13 +339,8 @@ static int collect_arguments(struct token *start, struct symbol *sym, struct arg
}
if (v && !vararg)
goto Eexcess;
- if (vararg) {
- p = &arglist->next->count;
+ if (vararg)
args[fixed].arg[ARG_QUOTED] = v;
- args[fixed].count[ARG_NORMAL] = p->normal;
- args[fixed].count[ARG_QUOTED] = p->quoted;
- args[fixed].count[ARG_STR] = p->str;
- }
what->next = next->next;
return 1;
@@ -432,29 +419,6 @@ static struct token *stringify(struct token *arg)
return token;
}
-static void expand_arguments(int count, struct arg *args)
-{
- int i;
- for (i = 0; i < count; i++) {
- struct token *arg = args[i].arg[ARG_QUOTED];
- if (!arg)
- arg = &eof_token_entry;
- if (args[i].count[ARG_STR])
- args[i].arg[ARG_STR] = stringify(arg);
- if (args[i].count[ARG_NORMAL]) {
- if (!args[i].count[ARG_QUOTED]) {
- args[i].arg[ARG_NORMAL] = arg;
- args[i].arg[ARG_QUOTED] = NULL;
- } else if (eof_token(arg)) {
- args[i].arg[ARG_NORMAL] = arg;
- } else {
- args[i].arg[ARG_NORMAL] = dup_list(arg);
- }
- expand_list(&args[i].arg[ARG_NORMAL]);
- }
- }
-}
-
/*
* Possibly valid combinations:
* - ident + ident -> ident
@@ -651,11 +615,38 @@ static int handle_kludge(const struct token **p, struct arg *args)
}
}
+static struct token *do_argument(const struct token *body,
+ struct arg *args,
+ struct ident *expanding)
+{
+ struct token *arg = args[argnum(body)].arg[argkind(body)];
+ if (arg)
+ return arg;
+ arg = args[argnum(body)].arg[ARG_QUOTED];
+ if (!arg)
+ arg = &eof_token_entry;
+ if (argkind(body) == ARG_NORMAL) {
+ if (!eof_token(arg)) {
+ if (!(body->argnum & (1 << ARGNUM_CONSUME_EXPAND)))
+ arg = dup_list(arg);
+ expanding->tainted = 0;
+ expand_list(&arg);
+ expanding->tainted = 1;
+ }
+ return args[argnum(body)].arg[ARG_NORMAL] = arg;
+ }
+ if (argkind(body) == ARG_STR)
+ return args[argnum(body)].arg[ARG_STR] = stringify(arg);
+ return arg; // ARG_QUOTED
+}
+
static struct token **substitute(struct token **list, const struct token *body, struct arg *args)
{
struct position *base_pos = &(*list)->pos;
- int *count;
enum {Normal, Placeholder, Concat} state = Normal;
+ struct ident *expanding = (*list)->ident;
+
+ expanding->tainted = 1;
for (; !eof_token(body); body = body->next) {
struct token *added, *arg;
@@ -687,8 +678,7 @@ static struct token **substitute(struct token **list, const struct token *body,
break;
case TOKEN_MACRO_ARGUMENT:
- arg = args[argnum(body)].arg[argkind(body)];
- count = &args[argnum(body)].count[argkind(body)];
+ arg = do_argument(body, args, expanding);
if (!arg || eof_token(arg)) {
if (state == Concat)
state = Normal;
@@ -696,7 +686,7 @@ static struct token **substitute(struct token **list, const struct token *body,
state = Placeholder;
continue;
}
- if (!--*count)
+ if (body->argnum & (1 << ARGNUM_CONSUME))
tail = move_into(&added, arg);
else
tail = copy(&added, arg);
@@ -755,14 +745,11 @@ static int expand(struct token **list, struct symbol *sym)
return 1;
if (!collect_arguments(token->next, sym, args, token))
return 1;
- expand_arguments(nargs, args);
}
if (sym->expand)
return sym->expand(token, args) ? 0 : 1;
- expanding->tainted = 1;
-
last = token->next;
tail = substitute(list, expansion, args);
/*
@@ -1093,8 +1080,6 @@ Eargs:
static inline void set_arg_count(struct token *token)
{
token_type(token) = TOKEN_ARG_COUNT;
- token->count.normal = token->count.quoted =
- token->count.str = 0;
}
static struct token *parse_arguments(struct token *list)
@@ -1190,10 +1175,16 @@ Eva_args:
return NULL;
}
-static int try_arg(struct token *token, enum arg_kind kind, struct token *arglist)
+struct arg_state {
+ struct token *needs_raw;
+ struct token *needs_expanded;
+ struct token *needs_str;
+};
+
+static int try_arg(struct token *token, enum arg_kind kind, struct arg_state args[])
{
struct ident *ident = token->ident;
- int nr, n;
+ int nr;
if (!macro_funclike || token_type(token) != TOKEN_IDENT)
return 0;
@@ -1204,38 +1195,31 @@ static int try_arg(struct token *token, enum arg_kind kind, struct token *arglis
if (nr == macro_nargs)
return 0;
- arglist = arglist->next;
- for (int i = 0; i < nr; i++)
- arglist = arglist->next->next;
-
token->argnum = (nr << ARGNUM_BITS_STOLEN) | kind;
token_type(token) = TOKEN_MACRO_ARGUMENT;
switch (kind) {
- case ARG_NORMAL:
- n = ++arglist->next->count.normal;
- break;
case ARG_QUOTED:
- n = ++arglist->next->count.quoted;
+ args[nr].needs_raw = token;
break;
- default:
- n = ++arglist->next->count.str;
+ case ARG_NORMAL:
+ if (!args[nr].needs_expanded)
+ args[nr].needs_raw = token;
+ args[nr].needs_expanded = token;
+ break;
+ default: // ARG_STR
+ if (!args[nr].needs_str)
+ args[nr].needs_raw = token;
+ args[nr].needs_str = token;
}
- if (n)
- return nr == macro_vararg ? 2 : 1;
- /*
- * XXX - need saner handling of that
- * (>= 1024 instances of argument)
- */
- token_type(token) = TOKEN_ERROR;
- return -1;
+ return nr == macro_vararg ? 2 : 1;
}
-static struct token *handle_hash(struct token **p, struct token *arglist)
+static struct token *handle_hash(struct token **p, struct arg_state args[])
{
struct token *token = *p;
if (macro_funclike) {
struct token *next = token->next;
- if (!try_arg(next, ARG_STR, arglist))
+ if (!try_arg(next, ARG_STR, args))
goto Equote;
next->pos.whitespace = token->pos.whitespace;
__free_token(token);
@@ -1251,13 +1235,13 @@ Equote:
}
/* token->next is ## */
-static struct token *handle_hashhash(struct token *token, struct token *arglist)
+static struct token *handle_hashhash(struct token *token, struct arg_state args[])
{
struct token *last = token;
struct token *concat;
int state = match_op(token, ',');
- try_arg(token, ARG_QUOTED, arglist);
+ try_arg(token, ARG_QUOTED, args);
while (1) {
struct token *t;
@@ -1276,12 +1260,12 @@ static struct token *handle_hashhash(struct token *token, struct token *arglist)
goto Econcat;
if (match_op(t, '#')) {
- t = handle_hash(&concat->next, arglist);
+ t = handle_hash(&concat->next, args);
if (!t)
return NULL;
}
- is_arg = try_arg(t, ARG_QUOTED, arglist);
+ is_arg = try_arg(t, ARG_QUOTED, args);
if (state == 1 && is_arg) {
state = is_arg;
@@ -1304,8 +1288,9 @@ Econcat:
return NULL;
}
-static struct token *parse_expansion(struct token *expansion, struct token *arglist, struct ident *name)
+static struct token *parse_expansion(struct token *expansion, struct ident *name)
{
+ struct arg_state args[macro_nargs] = {};
struct token *token = expansion;
struct token **p;
@@ -1314,19 +1299,30 @@ static struct token *parse_expansion(struct token *expansion, struct token *argl
for (p = &expansion; !eof_token(token); p = &token->next, token = *p) {
if (match_op(token, '#')) {
- token = handle_hash(p, arglist);
+ token = handle_hash(p, args);
if (!token)
return NULL;
}
if (match_op(token->next, SPECIAL_HASHHASH)) {
- token = handle_hashhash(token, arglist);
+ token = handle_hashhash(token, args);
if (!token)
return NULL;
} else {
- try_arg(token, ARG_NORMAL, arglist);
+ try_arg(token, ARG_NORMAL, args);
+ }
+ }
+ for (int i = 0; i < macro_nargs; i++) {
+ if (args[i].needs_str)
+ args[i].needs_str->argnum |= 1 << ARGNUM_CONSUME;
+ if (args[i].needs_expanded)
+ args[i].needs_expanded->argnum |= 1 << ARGNUM_CONSUME;
+ if (args[i].needs_raw) {
+ struct token *p = args[i].needs_raw;
+ if (argkind(p) == ARG_QUOTED)
+ p->argnum |= 1 << ARGNUM_CONSUME;
+ else if (argkind(p) == ARG_NORMAL)
+ p->argnum |= 1 << ARGNUM_CONSUME_EXPAND;
}
- if (token_type(token) == TOKEN_ERROR)
- goto Earg;
}
token = alloc_token(&expansion->pos);
token_type(token) = TOKEN_UNTAINT;
@@ -1338,9 +1334,6 @@ static struct token *parse_expansion(struct token *expansion, struct token *argl
Econcat:
sparse_error(token->pos, "'##' cannot appear at the ends of macro expansion");
return NULL;
-Earg:
- sparse_error(token->pos, "too many instances of argument in body");
- return NULL;
}
static int do_define(struct position pos, struct token *token, struct ident *name,
@@ -1349,7 +1342,7 @@ static int do_define(struct position pos, struct token *token, struct ident *nam
struct symbol *sym;
int ret = 1;
- expansion = parse_expansion(expansion, arglist, name);
+ expansion = parse_expansion(expansion, name);
if (!expansion)
goto out;
diff --git a/token.h b/token.h
index 273da39a..b28ac2ca 100644
--- a/token.h
+++ b/token.h
@@ -182,13 +182,17 @@ enum arg_kind {
};
enum {
- ARGNUM_BITS_STOLEN = 2
+ ARGNUM_CONSUME = 2,
+ ARGNUM_CONSUME_EXPAND,
+ ARGNUM_BITS_STOLEN
};
enum {
- ARGNUM_KIND_MASK = 3
+ ARGNUM_KIND_MASK = (1 << ARGNUM_CONSUME) - 1
};
+// _Static_assert(ARGNUM_KIND_MASK >= ARG_STR)
+
/*
* This is a very common data structure, it should be kept
* as small as humanly possible. Big (rare) types go as
--
2.47.3
* [PATCH 11/21] kill create_arglist()
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (8 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 10/21] on-demand argument expansion Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 12/21] stop mangling arglist, get rid of TOKEN_ARG_COUNT Al Viro
` (9 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
We don't need the fake arglist for __has_extension() and its ilk anymore;
just set ->arglist to &eof_token_entry to indicate that arguments
are expected and set ->fixed_args and ->vararg to tell how many.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 37 +++++--------------------------------
1 file changed, 5 insertions(+), 32 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index bd049620..aaf60293 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -2071,36 +2071,6 @@ static bool expand_has_feature(struct token *token, struct arg *args)
return 1;
}
-static void create_arglist(struct symbol *sym, int count)
-{
- struct token *token;
- struct token **next;
-
- if (!count)
- return;
-
- token = __alloc_token(0);
- token_type(token) = TOKEN_ARG_COUNT;
- sym->arglist = token;
- sym->fixed_args = count;
- sym->vararg = 0;
- next = &token->next;
-
- while (count--) {
- struct token *id, *uses;
- id = __alloc_token(0);
- token_type(id) = TOKEN_IDENT;
- uses = __alloc_token(0);
- token_type(uses) = TOKEN_ARG_COUNT;
- uses->count.quoted = 1;
-
- *next = id;
- id->next = uses;
- next = &uses->next;
- }
- *next = &eof_token_entry;
-}
-
static void init_preprocessor(void)
{
int i;
@@ -2172,8 +2142,11 @@ static void init_preprocessor(void)
struct symbol *sym;
sym = create_symbol(stream, dynamic[i].name, SYM_NODE, NS_MACRO);
sym->expand_simple = dynamic[i].expand_simple;
- if ((sym->expand = dynamic[i].expand) != NULL)
- create_arglist(sym, 1);
+ if ((sym->expand = dynamic[i].expand) != NULL) {
+ sym->fixed_args = 1;
+ sym->vararg = false;
+ sym->arglist = &eof_token_entry;
+ }
}
counter_macro = 0;
--
2.47.3
* [PATCH 12/21] stop mangling arglist, get rid of TOKEN_ARG_COUNT
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (9 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 11/21] kill create_arglist() Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 13/21] deal with ## on arguments separately Al Viro
` (8 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
Now it can be done - we no longer store the counters in arglist, so
there's no reason to mangle it. Just have parse_arguments() return the
pointer to the closing ) on success and let the caller split the list at
that point.
Simplifies both parse_arguments() and dump_macro() and fixes
a bug in the latter - pre-C99 gcc vararg macros used to lose ... in
-dM output. They did work correctly, but dump_macro() produced
#define A(X,Y) instead of the correct #define A(X,Y...)
Testcase added.
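The split done by the caller can be sketched on a toy token list. This is
an illustration only: struct tok, eof_tok, closing_paren() and split_after()
are hypothetical stand-ins for sparse's struct token, eof_token_entry and
the code in do_handle_define(), and the real parse_arguments() does a lot
more validation than the search shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified token; sparse's struct token carries far more. */
struct tok {
	const char *text;
	struct tok *next;
};

/* Sentinel terminating every list, standing in for eof_token_entry. */
static struct tok eof_tok = { "<eof>", &eof_tok };

/* Return the ")" that closes the parameter list opened at 'open', or NULL. */
struct tok *closing_paren(struct tok *open)
{
	for (struct tok *t = open->next; t != &eof_tok; t = t->next)
		if (strcmp(t->text, ")") == 0)
			return t;
	return NULL;
}

/*
 * Split the list right after 'last': everything up to and including
 * 'last' stays in the first list, the remainder is returned.
 */
struct tok *split_after(struct tok *last)
{
	struct tok *rest;

	assert(last != NULL);
	rest = last->next;
	last->next = &eof_tok;
	return rest;
}
```

With the tokens of "(X,Y...) Y" chained together, closing_paren() finds the
")", and split_after() leaves "(X,Y...)" as the argument list while
returning "Y" as the body; since nothing in the argument list is rewritten,
the "..." is still there when the list is printed.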
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 47 +++++++---------------------
token.h | 9 ------
tokenize.c | 4 ---
validation/preprocessor/dump-macro.c | 4 ++-
4 files changed, 15 insertions(+), 49 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index aaf60293..a60ad687 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -1000,7 +1000,6 @@ static int token_different(struct token *t1, struct token *t2)
case TOKEN_IDENT:
different = t1->ident != t2->ident;
break;
- case TOKEN_ARG_COUNT:
case TOKEN_UNTAINT:
case TOKEN_CONCAT:
case TOKEN_GNU_KLUDGE:
@@ -1077,22 +1076,12 @@ Eargs:
return false;
}
-static inline void set_arg_count(struct token *token)
-{
- token_type(token) = TOKEN_ARG_COUNT;
-}
-
static struct token *parse_arguments(struct token *list)
{
struct token *arg = list->next, *next = list;
- set_arg_count(list);
-
- if (match_op(arg, ')')) {
- next = arg->next;
- list->next = &eof_token_entry;
- return next;
- }
+ if (match_op(arg, ')'))
+ return arg;
while (token_type(arg) == TOKEN_IDENT) {
if (arg->ident == &__VA_ARGS___ident)
@@ -1102,26 +1091,18 @@ static struct token *parse_arguments(struct token *list)
next = arg->next;
if (match_op(next, ',')) {
- set_arg_count(next);
arg = next->next;
continue;
}
- if (match_op(next, ')')) {
- set_arg_count(next);
- next = next->next;
- arg->next->next = &eof_token_entry;
+ if (match_op(next, ')'))
return next;
- }
/* normal cases are finished here */
if (match_op(next, SPECIAL_ELLIPSIS)) {
if (match_op(next->next, ')')) {
- set_arg_count(next);
macro_vararg = macro_nargs - 1;
- next = next->next;
- arg->next->next = &eof_token_entry;
return next->next;
}
@@ -1143,10 +1124,7 @@ static struct token *parse_arguments(struct token *list)
goto Enotclosed;
if (!macro_add_arg(arg->pos, &__VA_ARGS___ident))
return NULL;
- set_arg_count(next);
macro_vararg = macro_nargs - 1;
- next = next->next;
- arg->next->next = &eof_token_entry;
return next;
}
@@ -1483,14 +1461,19 @@ static int do_handle_define(struct stream *stream, struct token **line, struct t
expansion = left->next;
if (!expansion->pos.whitespace) {
if (match_op(expansion, '(')) {
- arglist = expansion;
- expansion = parse_arguments(expansion);
- if (!expansion) {
+ struct token *last = parse_arguments(expansion);
+ if (!last) {
macro_nargs = 0;
macro_vararg = -1;
return 1;
}
+ // last points to ) at the end of arguments,
+ // expansion starts right after that,
+ // everything up to that point is arglist.
macro_funclike = true;
+ arglist = expansion;
+ expansion = last->next;
+ last->next = &eof_token_entry;
} else if (!eof_token(expansion)) {
warning(expansion->pos,
"no whitespace before object-like macro body");
@@ -2281,20 +2264,14 @@ static void dump_macro(struct symbol *sym)
printf("#define %s", show_ident(sym->ident));
token = sym->arglist;
if (token) {
- const char *sep = "";
int narg = 0;
- putchar('(');
for (; !eof_token(token); token = token->next) {
- if (token_type(token) == TOKEN_ARG_COUNT)
- continue;
- printf("%s%s", sep, show_token(token));
+ printf("%s", show_token(token));
if (token_type(token) == TOKEN_IDENT)
args[narg++] = token->ident;
- sep = ",";
}
if (narg < nargs)
args[narg] = &__VA_ARGS___ident;
- putchar(')');
}
token = sym->expansion;
diff --git a/token.h b/token.h
index b28ac2ca..e469e02d 100644
--- a/token.h
+++ b/token.h
@@ -103,7 +103,6 @@ enum token_type {
TOKEN_CONCAT,
TOKEN_GNU_KLUDGE,
TOKEN_UNTAINT,
- TOKEN_ARG_COUNT,
TOKEN_IF,
TOKEN_SKIP_GROUPS,
TOKEN_ELSE,
@@ -168,13 +167,6 @@ struct string {
char data[];
};
-/* will fit into 32 bits */
-struct argcount {
- unsigned normal:10;
- unsigned quoted:10;
- unsigned str:10;
-};
-
enum arg_kind {
ARG_QUOTED = 0,
ARG_NORMAL = 1,
@@ -207,7 +199,6 @@ struct token {
unsigned int special;
struct string *string;
int argnum;
- struct argcount count;
char embedded[4];
};
};
diff --git a/tokenize.c b/tokenize.c
index 54ea348c..85bc3f49 100644
--- a/tokenize.c
+++ b/tokenize.c
@@ -241,10 +241,6 @@ const char *show_token(const struct token *token)
sprintf(buffer, "<untaint>");
return buffer;
- case TOKEN_ARG_COUNT:
- sprintf(buffer, "<argcnt>");
- return buffer;
-
default:
sprintf(buffer, "unhandled token type '%d' ", token_type(token));
return buffer;
diff --git a/validation/preprocessor/dump-macro.c b/validation/preprocessor/dump-macro.c
index 46d70b34..710c1027 100644
--- a/validation/preprocessor/dump-macro.c
+++ b/validation/preprocessor/dump-macro.c
@@ -1,9 +1,11 @@
#define A(X,Y,...) __VA_ARGS__,Y,X
+#define B(X,Y...) Y
/*
* check-name: -dM handling of varargs
- * check-command: sparse -E -dM $file | tail -1
+ * check-command: sparse -E -dM $file | tail -2
*
* check-output-start
#define A(X,Y,...) __VA_ARGS__,Y,X
+#define B(X,Y...) Y
* check-output-end
*/
--
2.47.3
* [PATCH 13/21] deal with ## on arguments separately
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (10 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 12/21] stop mangling arglist, get rid of TOKEN_ARG_COUNT Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 14/21] preparations for __VA_OPT__ support: reshuffle argument slot assignments Al Viro
` (7 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
Adding/concatenating the chunks to the growing expansion is done at the end
of the loop body in substitute(); the preceding switch leaves the data for it
in two variables - 'added' is the first token of the next chunk and 'tail'
points to the forward pointer in the last token of that chunk.
The only case when we might be adding more than one token is a macro
argument; forcing it to use the same path as everything else complicates
things for no good reason, especially when it comes to concatenation.
Let the TOKEN_MACRO_ARGUMENT case deal with that stuff on its own.
In the case of concatenation let it merge the first token before
copying/inserting the rest; that simplifies the common case and it
simplifies the data flow for everyone since we don't need to bother with
'tail' anymore.
As a side benefit, merge() is no longer inlined, which reduces the spills.
That chunk could go after __VA_OPT__ handling, but having it done first
simplifies things for __VA_OPT__ (and especially for #__VA_OPT__()),
so let's put that one first.
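The behaviour that has to be preserved here is plain standard C: when an
argument consisting of several tokens follows ##, only its first token is
merged with the left operand and the rest is appended as is. A minimal
sketch of that, with PREFIX(), STR(), STR_() and the function names being
made up for illustration (the spacing of the stringified text is assumed to
be what gcc and clang produce):

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification: STR() expands its argument, STR_() quotes it. */
#define STR_(...) #__VA_ARGS__
#define STR(...) STR_(__VA_ARGS__)

#define PREFIX(x) pre_##x

/* Only 'a' is pasted onto 'pre_'; '+' and '1' follow unchanged. */
const char *pasted_text(void)
{
	return STR(PREFIX(a + 1));
}

/* The same expansion as an expression: pre_a + 1. */
int pasted_value(void)
{
	int pre_a = 5;

	return PREFIX(a + 1);
}

/* Hypothetical self-check for both views of the expansion. */
void check_paste(void)
{
	assert(strcmp(pasted_text(), "pre_a + 1") == 0);
	assert(pasted_value() == 6);
}
```

This is the "merge the first token, then copy or move the rest" order
described above, seen from the outside.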
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 38 +++++++++++++++++++++++---------------
1 file changed, 23 insertions(+), 15 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index a60ad687..16cec8e1 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -650,7 +650,7 @@ static struct token **substitute(struct token **list, const struct token *body,
for (; !eof_token(body); body = body->next) {
struct token *added, *arg;
- struct token **tail;
+ struct token **inserted_at;
const struct token *t;
switch (token_type(body)) {
@@ -674,7 +674,6 @@ static struct token **substitute(struct token **list, const struct token *body,
}
added = dup_token(t, base_pos);
token_type(added) = TOKEN_SPECIAL;
- tail = &added->next;
break;
case TOKEN_MACRO_ARGUMENT:
@@ -686,13 +685,28 @@ static struct token **substitute(struct token **list, const struct token *body,
state = Placeholder;
continue;
}
+ if (state == Concat && merge(containing_token(list), arg)) {
+ arg = arg->next;
+ if (eof_token(arg)) {
+ // merged the sole token in
+ state = Normal;
+ continue;
+ }
+ inserted_at = NULL;
+ } else {
+ inserted_at = list;
+ }
if (body->argnum & (1 << ARGNUM_CONSUME))
- tail = move_into(&added, arg);
+ list = move_into(list, arg);
else
- tail = copy(&added, arg);
- added->pos.newline = body->pos.newline;
- added->pos.whitespace = body->pos.whitespace;
- break;
+ list = copy(list, arg);
+ if (inserted_at) {
+ struct token *p = *inserted_at;
+ p->pos.whitespace = body->pos.whitespace;
+ p->pos.newline = 0;
+ }
+ state = Normal;
+ continue;
case TOKEN_CONCAT:
if (state == Placeholder)
@@ -703,7 +717,6 @@ static struct token **substitute(struct token **list, const struct token *body,
default:
added = dup_token(body, base_pos);
- tail = &added->next;
break;
}
@@ -711,17 +724,12 @@ static struct token **substitute(struct token **list, const struct token *body,
* if we got to doing real concatenation, we already have
* added something into the list, so containing_token() is OK.
*/
- if (state == Concat && merge(containing_token(list), added)) {
- *list = added->next;
- if (tail != &added->next)
- list = tail;
- } else {
+ if (state != Concat || !merge(containing_token(list), added)) {
*list = added;
- list = tail;
+ list = &added->next;
}
state = Normal;
}
- *list = &eof_token_entry;
return list;
}
--
2.47.3
* [PATCH 14/21] preparations for __VA_OPT__ support: reshuffle argument slot assignments
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (11 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 13/21] deal with ## on arguments separately Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 15/21] pre-process.c: split try_arg() Al Viro
` (6 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
Move the vararg to slot 0, with the non-vararg arguments in slots
1..fixed_args; for macros without vararg arguments leave slot 0 unused.
Rationale: handling of __VA_OPT__ at expansion time will need to locate
the vararg; having it always in the same slot makes life easier.
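The renumbering can be sketched as a pair of small helpers. They are
hypothetical (arg_slot() and arg_slots() do not exist in sparse) and only
mirror the arithmetic used at definition time: 'nr' is the position of a
parameter in the macro's parameter list, 'nargs' the number of parameters
including the variadic one, and 'vararg' the position of the variadic
parameter or -1 if there is none.

```c
#include <assert.h>

/* Slot used for parameter 'nr': the vararg always lands in slot 0. */
int arg_slot(int nr, int vararg)
{
	assert(nr >= 0);
	return nr == vararg ? 0 : nr + 1;
}

/* Number of slots needed: slot 0 exists even when there is no vararg. */
int arg_slots(int nargs, int vararg)
{
	assert(nargs >= 0);
	return nargs + (vararg < 0);
}
```

For #define A(X,Y,...) that gives X in slot 1, Y in slot 2 and the vararg in
slot 0, three slots in total; for a two-argument macro without a vararg the
arguments sit in slots 1 and 2 and slot 0 stays empty, again three slots.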
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 34 ++++++++++++++++++----------------
1 file changed, 18 insertions(+), 16 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index 16cec8e1..fed3dc2a 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -321,7 +321,7 @@ static int collect_arguments(struct token *start, struct symbol *sym, struct arg
next = collect_arg(start, false, &what->pos);
if (token_type(next) != TOKEN_SPECIAL)
goto Eclosing;
- args[commas].arg[ARG_QUOTED] = start->next;
+ args[commas + 1].arg[ARG_QUOTED] = start->next;
if (!match_op(next, ',')) {
if (commas < fixed - 1)
goto Efew;
@@ -340,7 +340,7 @@ static int collect_arguments(struct token *start, struct symbol *sym, struct arg
if (v && !vararg)
goto Eexcess;
if (vararg)
- args[fixed].arg[ARG_QUOTED] = v;
+ args[0].arg[ARG_QUOTED] = v;
what->next = next->next;
return 1;
@@ -740,8 +740,7 @@ static int expand(struct token **list, struct symbol *sym)
struct ident *expanding = token->ident;
struct token **tail;
struct token *expansion = sym->expansion;
- int nargs = sym->fixed_args + sym->vararg;
- struct arg args[nargs];
+ struct arg args[sym->fixed_args + 1];
if (expanding->tainted) {
token->pos.noexpand = 1;
@@ -1181,6 +1180,7 @@ static int try_arg(struct token *token, enum arg_kind kind, struct arg_state arg
if (nr == macro_nargs)
return 0;
+ nr = nr == macro_vararg ? 0 : nr + 1;
token->argnum = (nr << ARGNUM_BITS_STOLEN) | kind;
token_type(token) = TOKEN_MACRO_ARGUMENT;
switch (kind) {
@@ -1197,7 +1197,7 @@ static int try_arg(struct token *token, enum arg_kind kind, struct arg_state arg
args[nr].needs_raw = token;
args[nr].needs_str = token;
}
- return nr == macro_vararg ? 2 : 1;
+ return nr == 0 ? 2 : 1;
}
static struct token *handle_hash(struct token **p, struct arg_state args[])
@@ -1276,7 +1276,8 @@ Econcat:
static struct token *parse_expansion(struct token *expansion, struct ident *name)
{
- struct arg_state args[macro_nargs] = {};
+ int slots = macro_nargs + (macro_vararg < 0);
+ struct arg_state args[slots] = {};
struct token *token = expansion;
struct token **p;
@@ -1297,7 +1298,7 @@ static struct token *parse_expansion(struct token *expansion, struct ident *name
try_arg(token, ARG_NORMAL, args);
}
}
- for (int i = 0; i < macro_nargs; i++) {
+ for (int i = 0; i < slots; i++) {
if (args[i].needs_str)
args[i].needs_str->argnum |= 1 << ARGNUM_CONSUME;
if (args[i].needs_expanded)
@@ -1975,7 +1976,7 @@ static int handle_nondirective(struct stream *stream, struct token **line, struc
static struct token *first_arg(struct arg *args)
{
- struct token *arg = args[0].arg[ARG_QUOTED];
+ struct token *arg = args[1].arg[ARG_QUOTED];
expand_list(&arg);
return arg;
}
@@ -2265,21 +2266,22 @@ struct token * preprocess(struct token *token)
static void dump_macro(struct symbol *sym)
{
- int nargs = sym->fixed_args + sym->vararg;
- struct ident *args[nargs];
+ int fixed_args = sym->fixed_args;
+ struct ident *args[fixed_args + 1];
struct token *token;
printf("#define %s", show_ident(sym->ident));
token = sym->arglist;
if (token) {
- int narg = 0;
- for (; !eof_token(token); token = token->next) {
+ args[0] = &__VA_ARGS___ident;
+ for (int n = 1; !eof_token(token); token = token->next) {
printf("%s", show_token(token));
- if (token_type(token) == TOKEN_IDENT)
- args[narg++] = token->ident;
+ if (token_type(token) == TOKEN_IDENT) {
+ args[n] = token->ident;
+ if (n++ == fixed_args)
+ n = 0;
+ }
}
- if (narg < nargs)
- args[narg] = &__VA_ARGS___ident;
}
token = sym->expansion;
--
2.47.3
* [PATCH 15/21] pre-process.c: split try_arg()
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (12 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 14/21] preparations for __VA_OPT__ support: reshuffle argument slot assignments Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 16/21] __VA_OPT__: parsing Al Viro
` (5 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
more __VA_OPT__ preparations - we want to split the "parse the possible
variable" from the parts that are sensitive to the kind of variable
access (in particular, to subsequent ## being or not being there).
With __VA_OPT__ we'll have a possibility of relevant ## being a lot
further ahead than the next token and we won't find it until we'd parsed
the entire __VA_OPT__(.....).
We could check for __VA_OPT__ _before_ checking for arguments, but that
ends up screwing code generation a lot, slowing down the normal case
where we've not a single __VA_OPT__ in the input.
Replace try_arg() with two new primitives:
* check_arg() - returns 0 if the next token is not an argument;
if the token is an argument, it gets converted to TOKEN_MACRO_ARGUMENT and
slot number + 1 is returned. That function gets only token and arg_state
array - 'kind' is not known yet. At the moment 'args' is not needed,
but it will be needed for __VA_OPT__ handling, so that argument stays.
Note that unlike try_arg() we don't need a special return value to tell
vararg from non-vararg argument - the slot number is sufficient now.
It's a vararg if and only if it occupies slot 0, i.e. if check_arg()
has returned 1.
* seen_arg() - gets called only for TOKEN_MACRO_ARGUMENT token,
does the rest of what try_arg() used to do. Returns void.
Calls of try_arg() are replaced with combinations of these two, the
first try_arg() in handle_hashhash() lifted into the only caller of
handle_hashhash() and its check_arg() folded with the one we do for non-##
case there.
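
The return convention of check_arg() described above can be modelled
in isolation. What follows is only a sketch under stated assumptions,
not code from the patch: struct macro_model and the model_*() helpers
are hypothetical stand-ins for sparse's macro_nargs, macro_vararg and
macro_arg_name, and identifiers are compared as strings rather than as
struct ident pointers.

```c
#include <string.h>

/* Hypothetical model of the per-macro state used while parsing a body. */
struct macro_model {
	int nargs;		/* number of parameters, vararg included */
	int vararg;		/* index of the vararg parameter, or -1 */
	const char *names[8];	/* parameter names, in declaration order */
};

/*
 * Returns 0 if 'ident' is not a parameter, otherwise slot number + 1.
 * The vararg parameter always lands in slot 0; fixed parameters are
 * shifted up by one.
 */
static int model_check_arg(const struct macro_model *m, const char *ident)
{
	int nr;

	for (nr = 0; nr < m->nargs; nr++)
		if (strcmp(m->names[nr], ident) == 0)
			break;
	if (nr == m->nargs)
		return 0;
	nr = nr == m->vararg ? 0 : nr + 1;
	return nr + 1;
}

/* A return value of 1 means slot 0, i.e. the vararg. */
static int model_is_vararg(int ret)
{
	return ret == 1;
}
```

With parameters (X, Y, ...) this model gives 2 for X, 3 for Y and 1 for
__VA_ARGS__, which is why no separate "this is a vararg" return value
is needed.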
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 37 ++++++++++++++++++++++++++-----------
1 file changed, 26 insertions(+), 11 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index fed3dc2a..51ad916c 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -1166,7 +1166,7 @@ struct arg_state {
struct token *needs_str;
};
-static int try_arg(struct token *token, enum arg_kind kind, struct arg_state args[])
+static int check_arg(struct token *token, struct arg_state args[])
{
struct ident *ident = token->ident;
int nr;
@@ -1181,8 +1181,14 @@ static int try_arg(struct token *token, enum arg_kind kind, struct arg_state arg
return 0;
nr = nr == macro_vararg ? 0 : nr + 1;
- token->argnum = (nr << ARGNUM_BITS_STOLEN) | kind;
+ token->argnum = nr << ARGNUM_BITS_STOLEN;
token_type(token) = TOKEN_MACRO_ARGUMENT;
+ return nr + 1;
+}
+
+static void seen_arg(struct token *token, enum arg_kind kind, struct arg_state args[], int nr)
+{
+ token->argnum |= kind;
switch (kind) {
case ARG_QUOTED:
args[nr].needs_raw = token;
@@ -1197,7 +1203,6 @@ static int try_arg(struct token *token, enum arg_kind kind, struct arg_state arg
args[nr].needs_raw = token;
args[nr].needs_str = token;
}
- return nr == 0 ? 2 : 1;
}
static struct token *handle_hash(struct token **p, struct arg_state args[])
@@ -1205,8 +1210,12 @@ static struct token *handle_hash(struct token **p, struct arg_state args[])
struct token *token = *p;
if (macro_funclike) {
struct token *next = token->next;
- if (!try_arg(next, ARG_STR, args))
+ int nr = check_arg(next, args);
+
+ if (!nr)
goto Equote;
+
+ seen_arg(next, ARG_STR, args, nr - 1);
next->pos.whitespace = token->pos.whitespace;
__free_token(token);
token = *p = next;
@@ -1226,12 +1235,10 @@ static struct token *handle_hashhash(struct token *token, struct arg_state args[
struct token *last = token;
struct token *concat;
int state = match_op(token, ',');
-
- try_arg(token, ARG_QUOTED, args);
+ int nr;
while (1) {
struct token *t;
- int is_arg;
/* eat duplicate ## */
concat = token->next;
@@ -1251,10 +1258,13 @@ static struct token *handle_hashhash(struct token *token, struct arg_state args[
return NULL;
}
- is_arg = try_arg(t, ARG_QUOTED, args);
+ nr = check_arg(t, args);
+ if (nr > 0)
+ seen_arg(t, ARG_QUOTED, args, nr - 1);
- if (state == 1 && is_arg) {
- state = is_arg;
+ if (state == 1 && nr > 0) {
+ if (nr == 1)
+ state = 2;
} else {
last = t;
state = match_op(t, ',');
@@ -1280,6 +1290,7 @@ static struct token *parse_expansion(struct token *expansion, struct ident *name
struct arg_state args[slots] = {};
struct token *token = expansion;
struct token **p;
+ int nr;
if (match_op(token, SPECIAL_HASHHASH))
goto Econcat;
@@ -1290,12 +1301,16 @@ static struct token *parse_expansion(struct token *expansion, struct ident *name
if (!token)
return NULL;
}
+ nr = check_arg(token, args);
if (match_op(token->next, SPECIAL_HASHHASH)) {
+ if (nr > 0)
+ seen_arg(token, ARG_QUOTED, args, nr - 1);
token = handle_hashhash(token, args);
if (!token)
return NULL;
} else {
- try_arg(token, ARG_NORMAL, args);
+ if (nr > 0)
+ seen_arg(token, ARG_NORMAL, args, nr - 1);
}
}
for (int i = 0; i < slots; i++) {
--
2.47.3
* [PATCH 16/21] __VA_OPT__: parsing
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (13 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 15/21] pre-process.c: split try_arg() Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 17/21] expansion-time va_opt handling Al Viro
` (4 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
va-opt-replacement can occur in any place where a macro argument of a
vararg macro might. It consists of identifier __VA_OPT__, followed by
'(', a sequence of pp-tokens with balanced parentheses (body of that
va-opt-replacement) and finally a ')'.
Body of va-opt-replacement may not contain __VA_OPT__ and may not begin
or end with a ##.
At the expansion time va-opt-replacement is handled at the same stage as
argument substitution. What happens depends upon the value of __VA_ARGS__
- if it would expand to an empty token sequence, each va-opt-replacement
in the body is treated the same way as an occurrence of an empty argument
(replaced by an empty string literal if preceded by a # operator and by
placemarker token otherwise).
If __VA_ARGS__ does *not* expand to an empty token sequence, the body of
va-opt-replacement is subjected to argument substitution and # processing,
as if it had been the entire macro body. Leading and trailing whitespace
is stripped from the result. If va-opt-replacement is not preceded by
a # operator, the resulting list is substituted in its place. If it
*is* preceded by a # operator, the resulting list is subjected to ##
processing/placemarker removal and converted into a string literal token.
That token is substituted in place of # va-opt-replacement combination.
All of that is followed by usual processing of remaining ## operators and
placemarker removal (we are, of course, allowed to calculate the individual
token concatenations earlier, provided that end result is the same).
For non-stringified instances it's _almost_ the same as if all
va-opt-replacements had been replaced with their bodies in case when
__VA_ARGS__ expands to non-empty sequence of tokens; the only difference
is that ## next to va-opt-replacement does not suppress expansion of
arguments inside; for example
#define FOO BAR
#define A(X) X ## 1 // X is not expanded
#define B(X,...) __VA_OPT__(X) ## 1 // X is expanded
A(FOO)
B(FOO,_)
B(FOO)
yields
FOO1
BAR1
1
Any ## inside the va-opt-replacement still have the usual effect on the
adjacent macro arguments.
In other words, for non-stringified __VA_OPT__ we can simply
* parse its body as if it had been an entire macro (with the
usual handling of arguments)
* when substitute() gets to va-opt-replacement, check if expansion
of __VA_ARGS__ is empty
* if it is, just do what we do when seeing an empty argument,
otherwise switch to taking tokens to interpret from the body of that
va-opt-replacement until we reach its end, then proceed to interpret
the rest of the body of our macro.
For stringified __VA_OPT__ we need to save the state of interpreter (body,
list, state), switch to (body of va-opt-replacement, private list, Normal)
and once we are done stringify the private list, restore the saved state
and add the string token we've got to the main list, same as usual.
Note on whitespace handling: whitespace in front of the first token
coming from va-opt-replacement is _not_ affected by whatever whitespace
we might have between __VA_OPT__ and '(' or '(' and the body; only
the whitespace preceding the __VA_OPT__ itself matters.
Representation:
* new token types: TOKEN_VA_OPT and TOKEN_VA_OPT_STR; va-opt-replacement
and # va-opt-replacement resp. get converted to that, with the body +
surrounding parentheses stripped from the list and reference to the
opening parenthesis stored into ->va_opt_linkage of the converted __VA_OPT__
token.
Closing parenthesis is converted to TOKEN_VA_OPT; its ->next points to
eof_token_entry to make it distinguishable from the normal TOKEN_VA_OPT
and its ->va_opt_linkage points back to the originating TOKEN_VA_OPT or
TOKEN_VA_OPT_STR - basically, that will serve as return instruction.
We could add a separate token type for that, but that would only make
things more inconvenient at expansion time.
Note that in all cases ->va_opt_linkage points to the token immediately
preceding the ones we should proceed to; that will simplify life at
expansion time.
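
The linkage described above can be illustrated with a toy model of the
non-stringified case. This is a sketch under stated assumptions, not
the code from this series: struct tok, walk() and build_example() are
hypothetical, NULL stands in for eof_token_entry, the trailing
TOKEN_UNTAINT is left out, and placemarkers, whitespace and
stringification are ignored. It lays out "a __VA_OPT__(b c) d" the way
the text describes and shows that a single "jump to the linkage, then
step to ->next" covers both entering and leaving the body.

```c
#include <stddef.h>
#include <string.h>

enum kind { K_PLAIN, K_VA_OPT };

struct tok {
	enum kind kind;
	const char *text;
	struct tok *next;	/* NULL plays the role of eof_token_entry */
	struct tok *link;	/* models ->va_opt_linkage */
};

/* A K_VA_OPT token with nothing after it is the "return" marker. */
static int is_return(const struct tok *t)
{
	return t->next == NULL;
}

/* Lay out "a __VA_OPT__(b c) d"; the body hangs off t[1] via ->link. */
static struct tok *build_example(struct tok t[7])
{
	t[0] = (struct tok){ K_PLAIN, "a", &t[1], NULL };
	t[1] = (struct tok){ K_VA_OPT, "__VA_OPT__", &t[2], &t[3] };
	t[2] = (struct tok){ K_PLAIN, "d", NULL, NULL };
	t[3] = (struct tok){ K_PLAIN, "(", &t[4], NULL };
	t[4] = (struct tok){ K_PLAIN, "b", &t[5], NULL };
	t[5] = (struct tok){ K_PLAIN, "c", &t[6], NULL };
	t[6] = (struct tok){ K_VA_OPT, ")", NULL, &t[1] };	/* return */
	return &t[0];
}

/* Collect the text of the tokens visited, separated by spaces. */
static void walk(const struct tok *body, int va_args_empty,
		 char *out, size_t size)
{
	out[0] = '\0';
	for (; body; body = body->next) {
		if (body->kind == K_VA_OPT) {
			/*
			 * Entering with non-empty __VA_ARGS__, or returning:
			 * jump to the linkage; the loop step then moves to
			 * the token right after it.
			 */
			if (is_return(body) || !va_args_empty)
				body = body->link;
			continue;
		}
		if (out[0])
			strncat(out, " ", size - strlen(out) - 1);
		strncat(out, body->text, size - strlen(out) - 1);
	}
}
```

In this model the walk visits "a b c d" when __VA_ARGS__ is non-empty
and "a d" when it is empty. The '(' token is never emitted, because it
is only ever jumped to and then stepped past.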
This commit contains the parser side of the things. Substitution side
is done in the next one.
* check_arg() taught to recognize and parse __VA_OPT__(...); returns -1
on failure and 0 (not an argument of macro) on success. Callers updated.
* dump_macro() and token_list_different() taught to handle those.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
ident-list.h | 1 +
pre-process.c | 233 ++++++++++++++++++-----
token.h | 3 +
validation/preprocessor/dump-macro.c | 4 +-
validation/preprocessor/va_opt_compare.c | 28 +++
validation/preprocessor/va_opt_parse.c | 37 ++++
6 files changed, 258 insertions(+), 48 deletions(-)
create mode 100644 validation/preprocessor/va_opt_compare.c
create mode 100644 validation/preprocessor/va_opt_parse.c
diff --git a/ident-list.h b/ident-list.h
index 3c08e8ca..556d4050 100644
--- a/ident-list.h
+++ b/ident-list.h
@@ -65,6 +65,7 @@ IDENT(c_generic_selections);
IDENT(c_static_assert);
__IDENT(pragma_ident, "__pragma__", 0);
__IDENT(__VA_ARGS___ident, "__VA_ARGS__", 0);
+__IDENT(__VA_OPT___ident, "__VA_OPT__", 0);
__IDENT(__func___ident, "__func__", 0);
__IDENT(__FUNCTION___ident, "__FUNCTION__", 0);
__IDENT(__PRETTY_FUNCTION___ident, "__PRETTY_FUNCTION__", 0);
diff --git a/pre-process.c b/pre-process.c
index 51ad916c..0f0dbc56 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -640,6 +640,11 @@ static struct token *do_argument(const struct token *body,
return arg; // ARG_QUOTED
}
+static bool is_end_va_opt(const struct token *token)
+{
+ return eof_token(token->next);
+}
+
static struct token **substitute(struct token **list, const struct token *body, struct arg *args)
{
struct position *base_pos = &(*list)->pos;
@@ -996,6 +1001,8 @@ static int handle_argv_include(struct stream *stream, struct token **list, struc
return handle_include_path(stream, list, token, 2);
}
+static int token_list_different(struct token *, struct token *);
+
static int token_different(struct token *t1, struct token *t2)
{
int different;
@@ -1039,6 +1046,29 @@ static int token_different(struct token *t1, struct token *t2)
different = memcmp(s1->data, s2->data, s1->length);
break;
}
+ case TOKEN_VA_OPT:
+ if (is_end_va_opt(t1)) {
+ /*
+ * t1 is a return (at the end of __VA_OPT__ body);
+ * the same should be true for t2 and that's it.
+ */
+ different = !is_end_va_opt(t2);
+ break;
+ }
+ /*
+ * t1 is a real __VA_OPT__; the same should be true for
+ * t2...
+ */
+ if (is_end_va_opt(t2)) {
+ different = 1;
+ break;
+ }
+ /* ... and their bodies should not be different */
+ /* fall-through */
+ case TOKEN_VA_OPT_STR:
+ different = token_list_different(t1->va_opt_linkage,
+ t2->va_opt_linkage);
+ break;
default:
different = 1;
break;
@@ -1083,6 +1113,13 @@ Eargs:
return false;
}
+static void misplaced_va_xxx(struct token *arg)
+{
+ sparse_error(arg->pos,
+ "%s can only appear in the expansion of a C99 variadic macro",
+ show_token(arg));
+}
+
static struct token *parse_arguments(struct token *list)
{
struct token *arg = list->next, *next = list;
@@ -1091,7 +1128,8 @@ static struct token *parse_arguments(struct token *list)
return arg;
while (token_type(arg) == TOKEN_IDENT) {
- if (arg->ident == &__VA_ARGS___ident)
+ if (arg->ident == &__VA_ARGS___ident ||
+ arg->ident == &__VA_OPT___ident)
goto Eva_args;
if (!macro_add_arg(arg->pos, arg->ident))
return NULL;
@@ -1156,7 +1194,7 @@ Enotclosed:
sparse_error(arg->pos, "missing ')' in macro parameter list");
return NULL;
Eva_args:
- sparse_error(arg->pos, "__VA_ARGS__ can only appear in the expansion of a C99 variadic macro");
+ misplaced_va_xxx(arg);
return NULL;
}
@@ -1166,24 +1204,84 @@ struct arg_state {
struct token *needs_str;
};
+static bool in_va_opt;
+
+static struct token **parse_body(struct token **list, struct arg_state args[]);
+
+static int parse_va_opt(struct token *token, struct arg_state args[])
+{
+ struct token **p = &token->next;
+ struct token *next = *p;
+ int nesting = 0;
+
+ if (macro_vararg < 0)
+ goto Evararg;
+ if (in_va_opt)
+ goto Enested;
+
+ if (!match_op(next, '('))
+ goto Eunterminated;
+ token_type(token) = TOKEN_VA_OPT;
+ token->va_opt_linkage = next;
+ next->next->pos.whitespace = token->pos.whitespace;
+ for (; !eof_token(next); p = &next->next, next = *p) {
+ if (token_type(next) != TOKEN_SPECIAL)
+ continue;
+ if (next->special == ')') {
+ if (!--nesting) {
+ *p = &eof_token_entry; // cut prior to that ')'
+ in_va_opt = true;
+ p = parse_body(&token->va_opt_linkage->next, args);
+ in_va_opt = false;
+ if (!p)
+ return -1;
+ // strip everything up to ')' from the list
+ token->next = next->next;
+ // convert the ')' into return
+ token_type(next) = TOKEN_VA_OPT;
+ next->va_opt_linkage = token;
+ next->next = &eof_token_entry;
+ // and reattach it to the end of body
+ *p = next;
+ return 0;
+ }
+ } else if (next->special == '(')
+ nesting++;
+ }
+Eunterminated:
+ sparse_error(token->pos, "unterminated __VA_OPT__");
+ return -1;
+
+Enested:
+ sparse_error(token->pos, "__VA_OPT__ may not appear in a __VA_OPT__");
+ return -1;
+Evararg:
+ misplaced_va_xxx(token);
+ return -1;
+}
+
static int check_arg(struct token *token, struct arg_state args[])
{
- struct ident *ident = token->ident;
+ struct ident *ident;
int nr;
- if (!macro_funclike || token_type(token) != TOKEN_IDENT)
+ if (!macro_nargs || token_type(token) != TOKEN_IDENT)
return 0;
+ ident = token->ident;
for (nr = 0; nr < macro_nargs && macro_arg_name[nr] != ident; nr++)
;
- if (nr == macro_nargs)
- return 0;
+ if (nr < macro_nargs) {
+ nr = nr == macro_vararg ? 0 : nr + 1;
+ token->argnum = nr << ARGNUM_BITS_STOLEN;
+ token_type(token) = TOKEN_MACRO_ARGUMENT;
+ return nr + 1;
+ }
- nr = nr == macro_vararg ? 0 : nr + 1;
- token->argnum = nr << ARGNUM_BITS_STOLEN;
- token_type(token) = TOKEN_MACRO_ARGUMENT;
- return nr + 1;
+ if (ident != &__VA_OPT___ident)
+ return 0;
+ return parse_va_opt(token, args);
}
static void seen_arg(struct token *token, enum arg_kind kind, struct arg_state args[], int nr)
@@ -1210,13 +1308,19 @@ static struct token *handle_hash(struct token **p, struct arg_state args[])
struct token *token = *p;
if (macro_funclike) {
struct token *next = token->next;
- int nr = check_arg(next, args);
-
- if (!nr)
- goto Equote;
+ int nr;
- seen_arg(next, ARG_STR, args, nr - 1);
next->pos.whitespace = token->pos.whitespace;
+
+ nr = check_arg(next, args);
+ if (nr < 0)
+ return NULL;
+ if (token_type(next) == TOKEN_MACRO_ARGUMENT)
+ seen_arg(next, ARG_STR, args, nr - 1);
+ else if (token_type(next) == TOKEN_VA_OPT)
+ token_type(next) = TOKEN_VA_OPT_STR;
+ else
+ goto Equote;
__free_token(token);
token = *p = next;
} else {
@@ -1259,6 +1363,8 @@ static struct token *handle_hashhash(struct token *token, struct arg_state args[
}
nr = check_arg(t, args);
+ if (nr < 0)
+ return NULL;
if (nr > 0)
seen_arg(t, ARG_QUOTED, args, nr - 1);
@@ -1284,24 +1390,24 @@ Econcat:
return NULL;
}
-static struct token *parse_expansion(struct token *expansion, struct ident *name)
+static struct token **parse_body(struct token **list, struct arg_state args[])
{
- int slots = macro_nargs + (macro_vararg < 0);
- struct arg_state args[slots] = {};
- struct token *token = expansion;
- struct token **p;
- int nr;
+ struct token *token = *list;
if (match_op(token, SPECIAL_HASHHASH))
goto Econcat;
- for (p = &expansion; !eof_token(token); p = &token->next, token = *p) {
+ while (!eof_token(token)) {
+ int nr;
+
if (match_op(token, '#')) {
- token = handle_hash(p, args);
+ token = handle_hash(list, args);
if (!token)
return NULL;
}
nr = check_arg(token, args);
+ if (nr < 0)
+ return NULL;
if (match_op(token->next, SPECIAL_HASHHASH)) {
if (nr > 0)
seen_arg(token, ARG_QUOTED, args, nr - 1);
@@ -1312,7 +1418,26 @@ static struct token *parse_expansion(struct token *expansion, struct ident *name
if (nr > 0)
seen_arg(token, ARG_NORMAL, args, nr - 1);
}
+ list = &token->next;
+ token = *list;
}
+ return list;
+
+Econcat:
+ sparse_error(token->pos, "'##' cannot appear at the ends of macro expansion");
+ return NULL;
+}
+
+static struct token *parse_expansion(struct token *expansion, struct ident *name)
+{
+ int slots = macro_nargs + (macro_vararg < 0);
+ struct arg_state args[slots] = {};
+ struct token **tail;
+ struct token *token;
+
+ tail = parse_body(&expansion, args);
+ if (!tail)
+ return NULL;
for (int i = 0; i < slots; i++) {
if (args[i].needs_str)
args[i].needs_str->argnum |= 1 << ARGNUM_CONSUME;
@@ -1329,13 +1454,9 @@ static struct token *parse_expansion(struct token *expansion, struct ident *name
token = alloc_token(&expansion->pos);
token_type(token) = TOKEN_UNTAINT;
token->ident = name;
- token->next = *p;
- *p = token;
+ token->next = &eof_token_entry;
+ *tail = token;
return expansion;
-
-Econcat:
- sparse_error(token->pos, "'##' cannot appear at the ends of macro expansion");
- return NULL;
}
static int do_define(struct position pos, struct token *token, struct ident *name,
@@ -2279,6 +2400,40 @@ struct token * preprocess(struct token *token)
return token;
}
+static void dump_body(struct token *token, struct ident *args[])
+{
+ bool first = true;
+ while (!eof_token(token) && token_type(token) != TOKEN_UNTAINT) {
+ struct token *next = token->next;
+ if (!first && token->pos.whitespace)
+ putchar(' ');
+ first = false;
+ switch (token_type(token)) {
+ case TOKEN_CONCAT:
+ printf("##");
+ break;
+ case TOKEN_MACRO_ARGUMENT:
+ if (argkind(token) == ARG_STR)
+ printf("#");
+ printf("%s", show_ident(args[argnum(token)]));
+ break;
+ default:
+ printf("%s", show_token(token));
+ break;
+ case TOKEN_VA_OPT_STR:
+ printf("#");
+ /* fall-through */
+ case TOKEN_VA_OPT:
+ if (is_end_va_opt(token))
+ break;
+ printf("__VA_OPT__(");
+ dump_body(token->va_opt_linkage->next, args);
+ printf(")");
+ }
+ token = next;
+ }
+}
+
static void dump_macro(struct symbol *sym)
{
int fixed_args = sym->fixed_args;
@@ -2298,26 +2453,10 @@ static void dump_macro(struct symbol *sym)
}
}
}
+ putchar(' ');
token = sym->expansion;
- while (token_type(token) != TOKEN_UNTAINT) {
- struct token *next = token->next;
- if (token->pos.whitespace)
- putchar(' ');
- switch (token_type(token)) {
- case TOKEN_CONCAT:
- printf("##");
- break;
- case TOKEN_MACRO_ARGUMENT:
- if (argkind(token) == ARG_STR)
- printf("#");
- printf("%s", show_ident(args[argnum(token)]));
- break;
- default:
- printf("%s", show_token(token));
- }
- token = next;
- }
+ dump_body(token, args);
putchar('\n');
}
diff --git a/token.h b/token.h
index e469e02d..3edf4ce1 100644
--- a/token.h
+++ b/token.h
@@ -102,6 +102,8 @@ enum token_type {
TOKEN_MACRO_ARGUMENT,
TOKEN_CONCAT,
TOKEN_GNU_KLUDGE,
+ TOKEN_VA_OPT,
+ TOKEN_VA_OPT_STR,
TOKEN_UNTAINT,
TOKEN_IF,
TOKEN_SKIP_GROUPS,
@@ -199,6 +201,7 @@ struct token {
unsigned int special;
struct string *string;
int argnum;
+ struct token *va_opt_linkage;
char embedded[4];
};
};
diff --git a/validation/preprocessor/dump-macro.c b/validation/preprocessor/dump-macro.c
index 710c1027..b0085840 100644
--- a/validation/preprocessor/dump-macro.c
+++ b/validation/preprocessor/dump-macro.c
@@ -1,11 +1,13 @@
#define A(X,Y,...) __VA_ARGS__,Y,X
#define B(X,Y...) Y
+#define C(...) __VA_OPT__(1 #__VA_ARGS__) #__VA_OPT__(1 __VA_ARGS__)
/*
* check-name: -dM handling of varargs
- * check-command: sparse -E -dM $file | tail -2
+ * check-command: sparse -E -dM $file | tail -3
*
* check-output-start
#define A(X,Y,...) __VA_ARGS__,Y,X
#define B(X,Y...) Y
+#define C(...) __VA_OPT__(1 #__VA_ARGS__) #__VA_OPT__(1 __VA_ARGS__)
* check-output-end
*/
diff --git a/validation/preprocessor/va_opt_compare.c b/validation/preprocessor/va_opt_compare.c
new file mode 100644
index 00000000..ad15cabe
--- /dev/null
+++ b/validation/preprocessor/va_opt_compare.c
@@ -0,0 +1,28 @@
+#define OK1(X,...) __VA_OPT__(X =)
+#define OK1(X,...) __VA_OPT__(X =)
+#define OK2(X,...) #__VA_OPT__(X =)
+#define OK2(X,...) #__VA_OPT__(X =)
+#define BAD1(X,...) __VA_OPT__(X)
+#define BAD1(X,...) __VA_OPT__(_)
+#define BAD2(X,...) __VA_OPT__(,)
+#define BAD2(X,...) ,
+#define BAD3(X,...) __VA_OPT__(,)
+#define BAD3(X,...) #__VA_OPT__(,)
+/*
+ * check-name: __VA_OPT__ comparison
+ * check-command: sparse -E $file
+ *
+ * check-output-start
+
+
+ * check-output-end
+ *
+ * check-error-start
+preprocessor/va_opt_compare.c:6:9: warning: preprocessor token BAD1 redefined
+preprocessor/va_opt_compare.c:5:9: this was the original definition
+preprocessor/va_opt_compare.c:8:9: warning: preprocessor token BAD2 redefined
+preprocessor/va_opt_compare.c:7:9: this was the original definition
+preprocessor/va_opt_compare.c:10:9: warning: preprocessor token BAD3 redefined
+preprocessor/va_opt_compare.c:9:9: this was the original definition
+ * check-error-end
+ */
diff --git a/validation/preprocessor/va_opt_parse.c b/validation/preprocessor/va_opt_parse.c
new file mode 100644
index 00000000..4eb8675d
--- /dev/null
+++ b/validation/preprocessor/va_opt_parse.c
@@ -0,0 +1,37 @@
+#define A(__VA_OPT__)
+#define B(X) __VA_OPT__(_)
+#define C(X,...) __VA_OPT__(__VA_OPT__(_))
+#define D(X,...) __VA_OPT__
+#define E(X,...) __VA_OPT__(_
+#define OK(X,...) __VA_OPT__()
+#define OK2(X,...) __VA_OPT__(,(,,),)
+#define F(X,...) __VA_OPT__(,(,,,)
+#define OK3(X,...) __VA_OPT__(,(,,),))
+#define G1(...) __VA_OPT__(##)
+#define G2(...) __VA_OPT__(##,)
+#define G3(...) __VA_OPT__(,##)
+#define H(...) __VA_OPT__(#1)
+#define OK4(X,...) __VA_OPT__(__VA_ARGS__,#X)
+#define OK5(X,...) #__VA_OPT__(__VA_ARGS__,#X)
+/*
+ * check-name: __VA_OPT__ parsing
+ * check-command: sparse -E $file
+ *
+ * check-output-start
+
+
+ * check-output-end
+ *
+ * check-error-start
+preprocessor/va_opt_parse.c:1:11: error: __VA_OPT__ can only appear in the expansion of a C99 variadic macro
+preprocessor/va_opt_parse.c:2:14: error: __VA_OPT__ can only appear in the expansion of a C99 variadic macro
+preprocessor/va_opt_parse.c:3:29: error: __VA_OPT__ may not appear in a __VA_OPT__
+preprocessor/va_opt_parse.c:4:18: error: unterminated __VA_OPT__
+preprocessor/va_opt_parse.c:5:18: error: unterminated __VA_OPT__
+preprocessor/va_opt_parse.c:8:18: error: unterminated __VA_OPT__
+preprocessor/va_opt_parse.c:10:28: error: '##' cannot appear at the ends of macro expansion
+preprocessor/va_opt_parse.c:11:28: error: '##' cannot appear at the ends of macro expansion
+preprocessor/va_opt_parse.c:12:29: error: '##' cannot appear at the ends of macro expansion
+preprocessor/va_opt_parse.c:13:27: error: '#' is not followed by a macro parameter
+ * check-error-end
+ */
--
2.47.3
* [PATCH 17/21] expansion-time va_opt handling
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (14 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 16/21] __VA_OPT__: parsing Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 18/21] merge(): saner handling of ->noexpand Al Viro
` (3 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
Teach the interpreter (== substitute()) to handle TOKEN_VA_OPT and
TOKEN_VA_OPT_STR.
Two tricky parts, both related to calculating when an argument can be
consumed. One is that in a situation like
#define A(x,...) __VA_OPT__(x) foo_##x x
we might end up doing expansion of x either at the 1st occurrence (inside
__VA_OPT__) or at the 1st one _not_ inside __VA_OPT__ (the 3rd one in
this example). So at parsing time we need to keep track of whether
we'd already seen an unconditional use of expanded form and similarly
for stringified one.
Another is that getting to the first __VA_OPT__ means that
we need to find out whether the expanded form of __VA_ARGS__ is empty.
If there'd been a prior expanding occurrence of __VA_ARGS__, we are
fine; if there hadn't, we need to make sure that unexpanded form
survives at least until that point.
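
The "is the expanded __VA_ARGS__ empty?" check from the second point
can be sketched as a three-way decision. This is a toy model under
stated assumptions, not the skip_va_opt() added below: token lists are
replaced by strings, struct va_args_model and toy_expand() are
hypothetical, and the only macro the toy expander knows is EMP, which
expands to nothing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the two forms kept for the vararg slot. */
struct va_args_model {
	const char *raw;	/* unexpanded form, NULL if already consumed */
	const char *expanded;	/* cached expanded form, NULL if not computed */
};

/* Toy expander: EMP expands to nothing, everything else to itself. */
static const char *toy_expand(const char *raw)
{
	return strcmp(raw, "EMP") == 0 ? "" : raw;
}

/* Returns nonzero if a __VA_OPT__ body should be skipped. */
static int model_skip_va_opt(struct va_args_model *va)
{
	/* An earlier expanding use already computed the answer. */
	if (va->expanded)
		return va->expanded[0] == '\0';
	/* No arguments at all: nothing to expand. */
	if (!va->raw || va->raw[0] == '\0')
		return 1;
	/* Otherwise the raw form must still be around; expand and cache. */
	va->expanded = toy_expand(va->raw);
	return va->expanded[0] == '\0';
}
```

The EMP case is the reason the raw form alone is not enough: an
argument that is non-empty before expansion can still expand to an
empty sequence, as in the F(EMP) test below.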
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 106 +++++++++++++++++++-
validation/preprocessor/va_opt.c | 54 ++++++++++
validation/preprocessor/va_opt2.c | 34 +++++++
validation/preprocessor/va_opt_whitespace.c | 14 +++
4 files changed, 204 insertions(+), 4 deletions(-)
create mode 100644 validation/preprocessor/va_opt.c
create mode 100644 validation/preprocessor/va_opt2.c
create mode 100644 validation/preprocessor/va_opt_whitespace.c
diff --git a/pre-process.c b/pre-process.c
index 0f0dbc56..eec0569c 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -419,6 +419,18 @@ static struct token *stringify(struct token *arg)
return token;
}
+static struct token *empty_string(const struct position *pos)
+{
+ struct token *token = __alloc_token(0);
+ static struct string empty = {.immutable = 1, .length = 1, .data = ""};
+
+ token->pos = *pos;
+ token_type(token) = TOKEN_STRING;
+ token->string = &empty;
+ token->next = &eof_token_entry;
+ return token;
+}
+
/*
* Possibly valid combinations:
* - ident + ident -> ident
@@ -645,11 +657,28 @@ static bool is_end_va_opt(const struct token *token)
return eof_token(token->next);
}
+static bool skip_va_opt(struct arg *args, struct ident *expanding)
+{
+ struct token *arg = args[0].arg[ARG_NORMAL];
+ if (arg)
+ return eof_token(arg);
+ arg = args[0].arg[ARG_QUOTED];
+ if (!arg || eof_token(arg))
+ return true;
+ arg = dup_list(arg);
+ expanding->tainted = 0;
+ expand_list(&arg);
+ expanding->tainted = 1;
+ args[0].arg[ARG_NORMAL] = arg;
+ return eof_token(arg);
+}
+
static struct token **substitute(struct token **list, const struct token *body, struct arg *args)
{
struct position *base_pos = &(*list)->pos;
- enum {Normal, Placeholder, Concat} state = Normal;
+ enum {Normal, Placeholder, Concat} state = Normal, saved_state = Normal;
struct ident *expanding = (*list)->ident;
+ struct token **saved_list = NULL, *va_opt_list;
expanding->tainted = 1;
@@ -720,6 +749,47 @@ static struct token **substitute(struct token **list, const struct token *body,
state = Concat;
continue;
+ case TOKEN_VA_OPT:
+ // entering va_opt?
+ if (!is_end_va_opt(body)) {
+ if (skip_va_opt(args, expanding)) {
+ if (state == Concat)
+ state = Normal;
+ else
+ state = Placeholder;
+ continue;
+ }
+ body = body->va_opt_linkage;
+ continue;
+ }
+ body = body->va_opt_linkage;
+ // leaving va_opt?
+ if (token_type(body) == TOKEN_VA_OPT)
+ continue;
+ // leaving #va_opt
+ if (list == &va_opt_list) {
+ added = empty_string(base_pos);
+ } else {
+ *list = &eof_token_entry;
+ added = stringify(va_opt_list);
+ }
+ list = saved_list;
+ state = saved_state;
+ break;
+
+ case TOKEN_VA_OPT_STR:
+ // entering #va_opt
+ if (!skip_va_opt(args, expanding)) {
+ saved_state = state;
+ state = Normal;
+ saved_list = list;
+ list = &va_opt_list;
+ body = body->va_opt_linkage;
+ continue;
+ }
+ added = empty_string(base_pos);
+ break;
+
default:
added = dup_token(body, base_pos);
break;
@@ -1202,9 +1272,11 @@ struct arg_state {
struct token *needs_raw;
struct token *needs_expanded;
struct token *needs_str;
+ bool seen_uncond_expand;
+ bool seen_uncond_str;
};
-static bool in_va_opt;
+static bool in_va_opt, seen_va_opt;
static struct token **parse_body(struct token **list, struct arg_state args[]);
@@ -1221,6 +1293,23 @@ static int parse_va_opt(struct token *token, struct arg_state args[])
if (!match_op(next, '('))
goto Eunterminated;
+ if (!seen_va_opt) {
+ /*
+ * The first __VA_OPT__() will need an expanded __VA_ARGS__.
+ * if we had no prior expanded occurrences of __VA_ARGS__,
+ * we'll need its unexpanded form to survive until that point.
+ * Only the cannibalization of unexpanded form needs to be
+ * prevented; cannibalization of expanded form doesn't matter.
+ * We only want to know if it's an empty list, i.e. equal to
+ * &eof_token_entry, and the pointer stored in struct args
+ * ->arg[ARG_NORMAL] doesn't change when we get to the last
+ * expanded occurrence of __VA_ARGS__ and consume the list
+ * it's pointing to.
+ */
+ if (!args[0].needs_expanded)
+ args[0].needs_raw = token;
+ seen_va_opt = true;
+ }
token_type(token) = TOKEN_VA_OPT;
token->va_opt_linkage = next;
next->next->pos.whitespace = token->pos.whitespace;
@@ -1292,13 +1381,19 @@ static void seen_arg(struct token *token, enum arg_kind kind, struct arg_state a
args[nr].needs_raw = token;
break;
case ARG_NORMAL:
- if (!args[nr].needs_expanded)
+ if (!args[nr].seen_uncond_expand &&
+ (!in_va_opt || !args[nr].needs_expanded)) {
+ args[nr].seen_uncond_expand = !in_va_opt;
args[nr].needs_raw = token;
+ }
args[nr].needs_expanded = token;
break;
default: // ARG_STR
- if (!args[nr].needs_str)
+ if (!args[nr].seen_uncond_str &&
+ (!in_va_opt || !args[nr].needs_str)) {
+ args[nr].seen_uncond_str = !in_va_opt;
args[nr].needs_raw = token;
+ }
args[nr].needs_str = token;
}
}
@@ -1436,6 +1531,7 @@ static struct token *parse_expansion(struct token *expansion, struct ident *name
struct token *token;
tail = parse_body(&expansion, args);
+ seen_va_opt = false;
if (!tail)
return NULL;
for (int i = 0; i < slots; i++) {
@@ -1445,6 +1541,8 @@ static struct token *parse_expansion(struct token *expansion, struct ident *name
args[i].needs_expanded->argnum |= 1 << ARGNUM_CONSUME;
if (args[i].needs_raw) {
struct token *p = args[i].needs_raw;
+ if (token_type(p) != TOKEN_MACRO_ARGUMENT)
+ continue;
if (argkind(p) == ARG_QUOTED)
p->argnum |= 1 << ARGNUM_CONSUME;
else if (argkind(p) == ARG_NORMAL)
diff --git a/validation/preprocessor/va_opt.c b/validation/preprocessor/va_opt.c
new file mode 100644
index 00000000..4fa38794
--- /dev/null
+++ b/validation/preprocessor/va_opt.c
@@ -0,0 +1,54 @@
+#define LPAREN() (
+#define G(Q) 42
+#define F(R, X, ...) __VA_OPT__(G R X) )
+int x = F(LPAREN(), 0, <:-); // replaced by int x = 42;
+#undef F
+#undef G
+#define F(...) f(0 __VA_OPT__(,) __VA_ARGS__)
+#define G(X, ...) f(0, X __VA_OPT__(,) __VA_ARGS__)
+#define SDEF(sname, ...) S sname __VA_OPT__(= { __VA_ARGS__ })
+#define EMP
+F(a, b, c) // replaced by f(0, a, b, c)
+F() // replaced by f(0)
+F(EMP) // replaced by f(0)
+G(a, b, c) // replaced by f(0, a, b, c)
+G(a, ) // replaced by f(0, a)
+G(a) // replaced by f(0, a)
+SDEF(foo); // replaced by S foo;
+SDEF(bar, 1, 2); // replaced by S bar = { 1, 2 };
+// may not appear at the beginning of a replacement
+// list (6.10.5.3)
+#define H2(X, Y, ...) __VA_OPT__(X ## Y,) __VA_ARGS__
+H2(a, b, c, d) // replaced by ab, c, d
+#define H3(X, ...) #__VA_OPT__(X##X X##X)
+H3(, 0) // replaced by ""
+#define H4(X, ...) __VA_OPT__(a X ## X) ## b
+H4(, 1) // replaced by a b
+#define H5A(...) __VA_OPT__()/**/__VA_OPT__()
+#define H5B(X) a ## X ## b
+#define H5C(X) H5B(X)
+H5C(H5A()) // replaced by ab
+/*
+ * check-name: __VA_OPT__ expansion (examples from C23)
+ * check-command: sparse -E $file
+ *
+ * check-output-start
+
+int x = 42;
+f(0 , a, b, c)
+f(0)
+f(0)
+f(0, a , b, c)
+f(0, a)
+f(0, a)
+S foo;
+S bar = { 1, 2 };
+ab, c, d
+""
+a b
+ab
+ * check-output-end
+ *
+ * check-error-start
+ * check-error-end
+ */
diff --git a/validation/preprocessor/va_opt2.c b/validation/preprocessor/va_opt2.c
new file mode 100644
index 00000000..5523301e
--- /dev/null
+++ b/validation/preprocessor/va_opt2.c
@@ -0,0 +1,34 @@
+#define B(X) 1
+// don't screw unexpanded __VA_ARGS__ on prior __VA_OPT__
+#define A(...) __VA_OPT__(1) A##__VA_ARGS__
+A(B(_))
+// tests for skipping __VA_OPT__ don't care if expanded __VA_ARGS__
+// has been already consumed
+#define C(...) [__VA_ARGS__ __VA_OPT__(1)]
+C(_)
+C()
+// don't cannibalize unexpanded __VA_ARGS__ too early
+#define E(X)
+#define D(...) A##__VA_ARGS__ R __VA_OPT__(1)
+D(E(_))
+// check that parser clears seen_va_opt on failure exit
+#define BAD(...) __VA_OPT__(,) #1
+#define F(...) A##__VA_ARGS__ R __VA_OPT__(1)
+F(E(_))
+/*
+ * check-name: __VA_ARGS__ cannibalization with __VA_OPT__
+ * check-command: sparse -E $file
+ *
+ * check-output-start
+
+1 AB(_)
+[_ 1]
+[]
+AE(_) R
+AE(_) R
+ * check-output-end
+ *
+ * check-error-start
+preprocessor/va_opt2.c:15:32: error: '#' is not followed by a macro parameter
+ * check-error-end
+ */
diff --git a/validation/preprocessor/va_opt_whitespace.c b/validation/preprocessor/va_opt_whitespace.c
new file mode 100644
index 00000000..727327f0
--- /dev/null
+++ b/validation/preprocessor/va_opt_whitespace.c
@@ -0,0 +1,14 @@
+#define A(X,...) [__VA_OPT__( X)][ __VA_OPT__(X)]
+A(1,_)
+/*
+ * check-name: __VA_OPT__ whitespace
+ * check-command: sparse -E $file
+ *
+ * check-output-start
+
+[1][ 1]
+ * check-output-end
+ *
+ * check-error-start
+ * check-error-end
+ */
--
2.47.3
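The validation files above check __VA_OPT__ at the preprocessor level. The same behaviour can be observed at run time. What follows is a minimal sketch, assuming a compiler that already implements __VA_OPT__ (C23, or GCC/Clang as an extension). FORMAT_INTO, the four-argument-style SDEF variant, struct pair and the demo functions are hypothetical names for illustration and are not part of sparse or of this patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: the comma is emitted only when at least one
 * variadic argument is present, so both call forms below are valid. */
#define FORMAT_INTO(buf, fmt, ...) \
	snprintf((buf), sizeof(buf), fmt __VA_OPT__(,) __VA_ARGS__)

/* Variant of the SDEF example: the initializer appears only when
 * initial values are supplied. */
#define SDEF(type, name, ...) static type name __VA_OPT__(= { __VA_ARGS__ })

struct pair { int a, b; };

SDEF(struct pair, zeroed);		/* static struct pair zeroed; */
SDEF(struct pair, filled, 1, 2);	/* static struct pair filled = { 1, 2 }; */

/* Returns 1 if both call forms produce the expected text. */
int va_opt_demo(void)
{
	char one[32], two[32];

	FORMAT_INTO(one, "foo");	/* snprintf(one, sizeof(one), "foo") */
	FORMAT_INTO(two, "foo %d", 2);	/* snprintf(two, sizeof(two), "foo %d" , 2) */
	return strcmp(one, "foo") == 0 && strcmp(two, "foo 2") == 0;
}

/* Returns 1 if the optional initializer behaved as described. */
int sdef_demo(void)
{
	return zeroed.a == 0 && zeroed.b == 0 &&
	       filled.a == 1 && filled.b == 2;
}
```

Without __VA_OPT__(,) the first FORMAT_INTO call would leave a trailing comma in the snprintf() argument list and fail to compile, which is the case the GNU ", ## __VA_ARGS__" kludge was invented for.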
* [PATCH 18/21] merge(): saner handling of ->noexpand
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (15 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 17/21] expansion-time va_opt handling Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 19/21] simplify the calling conventions of collect_arguments() Al Viro
` (2 subsequent siblings)
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
We only care about noexpand for identifiers and solitary #; the latter
can't occur in merge(), the former should just get ->noexpand set
according to ->ident->tainted. That eliminates the last remaining
possibility of having expand() run into a token that has tainted
identifier - the regular noexpand check in the caller is sufficient now.
Should've done that all the way back in 2004...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index eec0569c..352f02df 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -507,7 +507,7 @@ static int merge(struct token *left, struct token *right)
switch (res) {
case TOKEN_IDENT:
left->ident = built_in_ident(buffer);
- left->pos.noexpand = 0;
+ left->pos.noexpand = left->ident->tainted;
return 1;
case TOKEN_NUMBER:
@@ -529,13 +529,11 @@ static int merge(struct token *left, struct token *right)
case TOKEN_WIDE_CHAR:
case TOKEN_WIDE_STRING:
token_type(left) = res;
- left->pos.noexpand = 0;
left->string = right->string;
return 1;
case TOKEN_WIDE_CHAR_EMBEDDED_0 ... TOKEN_WIDE_CHAR_EMBEDDED_3:
token_type(left) = res;
- left->pos.noexpand = 0;
memcpy(left->embedded, right->embedded, 4);
return 1;
@@ -812,16 +810,10 @@ static int expand(struct token **list, struct symbol *sym)
{
struct token *last;
struct token *token = *list;
- struct ident *expanding = token->ident;
struct token **tail;
struct token *expansion = sym->expansion;
struct arg args[sym->fixed_args + 1];
- if (expanding->tainted) {
- token->pos.noexpand = 1;
- return 1;
- }
-
if (sym->arglist) {
if (!match_op(scan_next(&token->next), '('))
return 1;
--
2.47.3
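The rule introduced by this patch can be modelled in a few lines. This is a simplified sketch, not sparse's code: the field names tainted and noexpand follow the patch, while struct layout, finish_ident_paste(), may_expand() and paste_demo() are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for sparse's structures. */
struct ident {
	const char *name;
	bool tainted;	/* set while the macro of this name is being expanded */
};

struct token {
	struct ident *ident;
	bool noexpand;	/* this token must never be macro-expanded */
};

/* A ## paste produced an identifier: the new token inherits the
 * "do not expand" state from the identifier it now names. */
void finish_ident_paste(struct token *left, struct ident *result)
{
	assert(left != NULL && result != NULL);
	left->ident = result;
	left->noexpand = result->tainted;
}

/* The single caller-side check that remains necessary. */
bool may_expand(const struct token *tok)
{
	return !tok->noexpand;
}

/* Returns 1 if a paste yielding a tainted name is blocked and a paste
 * yielding an untainted name is not. */
int paste_demo(void)
{
	struct ident busy = { "FOO", true };
	struct ident idle = { "BAR", false };
	struct token t = { NULL, false };

	finish_ident_paste(&t, &busy);
	if (may_expand(&t))
		return 0;
	finish_ident_paste(&t, &idle);
	return may_expand(&t) ? 1 : 0;
}
```

Because the flag is settled at the moment the identifier token is created, the expansion routine itself no longer needs to look at the taint state.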
* [PATCH 19/21] simplify the calling conventions of collect_arguments()
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (16 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 18/21] merge(): saner handling of ->noexpand Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 20/21] make expand_one_symbol() inline Al Viro
2026-03-16 7:04 ` [PATCH 21/21] substitute(): convert switch() into cascade of ifs Al Viro
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
Currently we call that only after having verified that macro name is
followed by the (, with those two tokens passed as separate arguments.
What's more, collect_arguments() already can tell the caller "don't
expand that" if the arguments are malformed, so there's no reason not to
move the check for opening parenthesis into collect_arguments() - that
makes the calling conventions simpler and it does not incur any cost -
collect_arguments() is going to be inlined into its sole caller anyway.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index 352f02df..73f4d615 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -307,16 +307,17 @@ struct arg {
struct token *arg[3];
};
-static int collect_arguments(struct token *start, struct symbol *sym, struct arg *args, struct token *what)
+static int collect_arguments(struct token *what, int fixed, bool vararg, struct arg *args)
{
- int fixed = sym->fixed_args;
- bool vararg = sym->vararg;
+ struct token *start = scan_next(&what->next);
struct token *next = NULL, *v = NULL;
const char *err;
int commas;
memset(args, 0, sizeof(struct arg) * (fixed + 1));
+ if (!match_op(start, '('))
+ return 0;
for (commas = 0; commas < fixed; commas++) {
next = collect_arg(start, false, &what->pos);
if (token_type(next) != TOKEN_SPECIAL)
@@ -355,7 +356,7 @@ Eexcess:
Eclosing:
err = "unterminated argument list invoking";
out:
- sparse_error(what->pos, "%s macro \"%s\"", err, show_ident(sym->ident));
+ sparse_error(what->pos, "%s macro \"%s\"", err, show_ident(what->ident));
what->next = next;
return 0;
}
@@ -808,23 +809,20 @@ static struct token **substitute(struct token **list, const struct token *body,
static int expand(struct token **list, struct symbol *sym)
{
- struct token *last;
+ struct token *next;
struct token *token = *list;
struct token **tail;
struct token *expansion = sym->expansion;
struct arg args[sym->fixed_args + 1];
- if (sym->arglist) {
- if (!match_op(scan_next(&token->next), '('))
- return 1;
- if (!collect_arguments(token->next, sym, args, token))
- return 1;
- }
+ if (sym->arglist &&
+ !collect_arguments(token, sym->fixed_args, sym->vararg, args))
+ return 1;
if (sym->expand)
return sym->expand(token, args) ? 0 : 1;
- last = token->next;
+ next = token->next;
tail = substitute(list, expansion, args);
/*
* Note that it won't be eof - at least TOKEN_UNTAINT will be there.
@@ -834,7 +832,7 @@ static int expand(struct token **list, struct symbol *sym)
*/
(*list)->pos.newline = token->pos.newline;
(*list)->pos.whitespace = token->pos.whitespace;
- *tail = last;
+ *tail = next;
return 0;
}
--
2.47.3
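The check that moves into collect_arguments() implements a standard C rule: the name of a function-like macro is only an invocation when the next token is an opening parenthesis. A small sketch of that rule as seen from user code follows; twice() and the three wrapper functions are hypothetical names chosen for the illustration.

```c
#include <assert.h>

int twice(int x)
{
	return 2 * x;
}

/* Function-like macro with the same name as the function above. */
#define twice(x) ((x) + 100)

/* Name followed by '(': the macro is expanded. */
int call_macro(void)
{
	return twice(1);
}

/* Name followed by ')': not an invocation, so the function is called. */
int call_function(void)
{
	return (twice)(1);
}

/* Name followed by ';': again left alone, yielding the function's address. */
int (*get_function(void))(int)
{
	return twice;
}
```

In the last two functions the preprocessor leaves the identifier untouched, which corresponds to the "return 0, don't expand" path for a missing opening parenthesis.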
* [PATCH 20/21] make expand_one_symbol() inline
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (17 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 19/21] simplify the calling conventions of collect_arguments() Al Viro
@ 2026-03-16 7:04 ` Al Viro
2026-03-16 7:04 ` [PATCH 21/21] substitute(): convert switch() into cascade of ifs Al Viro
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
better code generation that way...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/pre-process.c b/pre-process.c
index 73f4d615..728feeb3 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -206,7 +206,7 @@ static void expand_include_level(struct token *token)
replace_with_integer(token, include_level - 1);
}
-static int expand_one_symbol(struct token **list)
+static inline int expand_one_symbol(struct token **list)
{
struct token *token = *list;
struct symbol *sym;
--
2.47.3
* [PATCH 21/21] substitute(): convert switch() into cascade of ifs
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
` (18 preceding siblings ...)
2026-03-16 7:04 ` [PATCH 20/21] make expand_one_symbol() inline Al Viro
@ 2026-03-16 7:04 ` Al Viro
19 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-16 7:04 UTC (permalink / raw)
To: linux-sparse; +Cc: chriscli, torvalds, zxh, ben.dooks, dan.carpenter, rf
Again, better code generation that way (and I'd like to use likely()
here); it *is* in a very hot loop.
Reorder the TOKEN_... a bit (move TOKEN_UNTAINT up, so that it's less than
TOKEN_MACRO_ARGUMENT) to get the default (and by far the most common case)
via single comparison.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
pre-process.c | 68 +++++++++++++++++++++++----------------------------
token.h | 3 ++-
2 files changed, 33 insertions(+), 38 deletions(-)
diff --git a/pre-process.c b/pre-process.c
index 728feeb3..ea199a9a 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -682,34 +682,14 @@ static struct token **substitute(struct token **list, const struct token *body,
expanding->tainted = 1;
for (; !eof_token(body); body = body->next) {
- struct token *added, *arg;
- struct token **inserted_at;
- const struct token *t;
+ struct token *added;
- switch (token_type(body)) {
- case TOKEN_GNU_KLUDGE:
- /*
- * GNU kludge: if we had <comma>##<vararg>, behaviour
- * depends on whether we had enough arguments to have
- * a vararg. If we did, ## is just ignored. Otherwise
- * both , and ## are ignored. Worse, there can be
- * an arbitrary number of ##<arg> in between; if all of
- * those are empty, we act as if they hadn't been there,
- * otherwise we act as if the kludge didn't exist.
- */
- t = body;
- if (handle_kludge(&body, args)) {
- if (state == Concat)
- state = Normal;
- else
- state = Placeholder;
- continue;
- }
- added = dup_token(t, base_pos);
- token_type(added) = TOKEN_SPECIAL;
- break;
+ if (token_type(body) <= TOKEN_LAST_NORMAL) {
+ added = dup_token(body, base_pos);
+ } else if (token_type(body) == TOKEN_MACRO_ARGUMENT) {
+ struct token **inserted_at;
+ struct token *arg;
- case TOKEN_MACRO_ARGUMENT:
arg = do_argument(body, args, expanding);
if (!arg || eof_token(arg)) {
if (state == Concat)
@@ -740,15 +720,33 @@ static struct token **substitute(struct token **list, const struct token *body,
}
state = Normal;
continue;
-
- case TOKEN_CONCAT:
+ } else if (token_type(body) == TOKEN_CONCAT) {
if (state == Placeholder)
state = Normal;
else
state = Concat;
continue;
-
- case TOKEN_VA_OPT:
+ } else if (token_type(body) == TOKEN_GNU_KLUDGE) {
+ const struct token *t = body;
+ /*
+ * GNU kludge: if we had <comma>##<vararg>, behaviour
+ * depends on whether we had enough arguments to have
+ * a vararg. If we did, ## is just ignored. Otherwise
+ * both , and ## are ignored. Worse, there can be
+ * an arbitrary number of ##<arg> in between; if all of
+ * those are empty, we act as if they hadn't been there,
+ * otherwise we act as if the kludge didn't exist.
+ */
+ if (handle_kludge(&body, args)) {
+ if (state == Concat)
+ state = Normal;
+ else
+ state = Placeholder;
+ continue;
+ }
+ added = dup_token(t, base_pos);
+ token_type(added) = TOKEN_SPECIAL;
+ } else if (token_type(body) == TOKEN_VA_OPT) {
// entering va_opt?
if (!is_end_va_opt(body)) {
if (skip_va_opt(args, expanding)) {
@@ -774,9 +772,7 @@ static struct token **substitute(struct token **list, const struct token *body,
}
list = saved_list;
state = saved_state;
- break;
-
- case TOKEN_VA_OPT_STR:
+ } else if (token_type(body) == TOKEN_VA_OPT_STR) {
// entering #va_opt
if (!skip_va_opt(args, expanding)) {
saved_state = state;
@@ -787,10 +783,8 @@ static struct token **substitute(struct token **list, const struct token *body,
continue;
}
added = empty_string(base_pos);
- break;
-
- default:
- added = dup_token(body, base_pos);
+ } else {
+ sparse_error(body->pos, "bad token type(%d)", token_type(body));
break;
}
diff --git a/token.h b/token.h
index 3edf4ce1..5915d6a4 100644
--- a/token.h
+++ b/token.h
@@ -99,12 +99,13 @@ enum token_type {
TOKEN_SPECIAL,
TOKEN_STREAMBEGIN,
TOKEN_STREAMEND,
+ TOKEN_UNTAINT,
+ TOKEN_LAST_NORMAL = TOKEN_UNTAINT,
TOKEN_MACRO_ARGUMENT,
TOKEN_CONCAT,
TOKEN_GNU_KLUDGE,
TOKEN_VA_OPT,
TOKEN_VA_OPT_STR,
- TOKEN_UNTAINT,
TOKEN_IF,
TOKEN_SKIP_GROUPS,
TOKEN_ELSE,
--
2.47.3
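The enum reordering is what makes the first test in the cascade cheap: every token type that is simply copied sorts below the special ones, so one comparison covers the common case. A minimal sketch of the idea, using shortened hypothetical names (TOK_*, enum action, classify()) rather than sparse's real definitions:

```c
#include <assert.h>

enum tok_type {
	TOK_IDENT,
	TOK_NUMBER,
	TOK_SPECIAL,
	TOK_UNTAINT,
	TOK_LAST_NORMAL = TOK_UNTAINT,	/* everything up to here is copied as-is */
	TOK_MACRO_ARGUMENT,
	TOK_CONCAT,
	TOK_GNU_KLUDGE,
	TOK_VA_OPT,
	TOK_VA_OPT_STR,
	TOK_IF				/* not valid inside a macro body */
};

enum action {
	ACT_COPY, ACT_ARGUMENT, ACT_CONCAT, ACT_KLUDGE,
	ACT_VA_OPT, ACT_VA_OPT_STR, ACT_BAD
};

/* Cascade of ifs: the most frequent case is decided first. */
enum action classify(enum tok_type type)
{
	if (type <= TOK_LAST_NORMAL)	/* common case: a single comparison */
		return ACT_COPY;
	if (type == TOK_MACRO_ARGUMENT)
		return ACT_ARGUMENT;
	if (type == TOK_CONCAT)
		return ACT_CONCAT;
	if (type == TOK_GNU_KLUDGE)
		return ACT_KLUDGE;
	if (type == TOK_VA_OPT)
		return ACT_VA_OPT;
	if (type == TOK_VA_OPT_STR)
		return ACT_VA_OPT_STR;
	return ACT_BAD;			/* anything else should never get here */
}
```

Whether this beats a switch depends on the compiler; the commit message reports that it does for substitute(), and the sketch only shows the ordering trick, not a measurement.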
* Re: [RFC PATCH] pre-process: add __VA_OPT__ support
2026-03-16 6:56 ` Al Viro
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
@ 2026-03-16 16:42 ` Linus Torvalds
2026-03-19 3:53 ` Al Viro
2026-03-17 7:41 ` Chris Li
2026-03-18 6:35 ` Eric Zhang
3 siblings, 1 reply; 42+ messages in thread
From: Linus Torvalds @ 2026-03-16 16:42 UTC (permalink / raw)
To: Al Viro; +Cc: Eric Zhang, linux-sparse, dan.carpenter, chriscli, ben.dooks, rf
I have tested that branch on a few trivial cases, and it looks good to me.
I did write a long rant about how I hate cpp tricks and wish we had a
few simple extensions (__VA_COUNT__ would be the most simple one,
because COUNT_ARGS() is disgusting), but it is what it is, and this
makes things better. So I decided to just delete my rant.
Linus
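For readers unfamiliar with the argument-counting trick mentioned above, here is a minimal sketch of how such a macro is usually built, and how __VA_OPT__ lets it report zero for an empty argument list. This is not the kernel's COUNT_ARGS() definition; COUNT_ARGS_PICK, the limit of six arguments, the "sentinel" token and the wrapper functions are assumptions made for the example.

```c
#include <assert.h>

/* Selects the seventh argument; whatever precedes it is discarded. */
#define COUNT_ARGS_PICK(a1, a2, a3, a4, a5, a6, n, ...) n

/* The caller's arguments push the descending list to the right, so the
 * value landing in slot 'n' is the argument count. __VA_OPT__(,) avoids
 * a stray leading comma when no arguments are given. 'sentinel' only
 * keeps the variadic part of COUNT_ARGS_PICK non-empty. */
#define COUNT_ARGS(...) \
	COUNT_ARGS_PICK(__VA_ARGS__ __VA_OPT__(,) 6, 5, 4, 3, 2, 1, 0, sentinel)

int count_none(void)  { return COUNT_ARGS(); }
int count_one(void)   { return COUNT_ARGS(a); }
int count_three(void) { return COUNT_ARGS(a, b, c); }
int count_six(void)   { return COUNT_ARGS(a, b, c, d, e, f); }
```

The counted arguments are never evaluated, since they are dropped during expansion. The sketch gives wrong answers beyond six arguments, which is part of why a built-in counter would be simpler.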
* Re: [RFC PATCH] pre-process: add __VA_OPT__ support
2026-03-16 6:56 ` Al Viro
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
2026-03-16 16:42 ` [RFC PATCH] pre-process: add __VA_OPT__ support Linus Torvalds
@ 2026-03-17 7:41 ` Chris Li
2026-03-18 6:35 ` Eric Zhang
3 siblings, 0 replies; 42+ messages in thread
From: Chris Li @ 2026-03-17 7:41 UTC (permalink / raw)
To: Al Viro
Cc: Eric Zhang, linux-sparse, dan.carpenter, chriscli, ben.dooks, rf,
torvalds
On Sun, Mar 15, 2026 at 11:53 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, Feb 26, 2026 at 07:29:45AM +0000, Al Viro wrote:
> > On Wed, Feb 25, 2026 at 10:18:51PM +0000, Al Viro wrote:
> >
> > > NOTE: substitute() is the second hottest loop in the entire thing; only
> > > tokenizer is hotter. And gcc is too enthusiastic about the inlining
> > > around that function, ending up with bad register spills, along with
> > > a bunch of stalls. Worse, decisions are sensitive to minor changes in
> > > places textually far away, making it a real bitch to deal with.
> > > Makes for fun reordering the commits in local queue... ;-/
> >
> > FWIW, looking at that thing again, I wonder if we would be better off
> > with doing argument expansion on demand rather than doing it in
> > expand_arguments(). Should be doable with a bit of care - we'd need
> > to mark the TOKEN_..._ARG with several bits to decide whether we
> > want to duplicate or not, etc., but that's worth doing anyway -
> > better than playing with the counters.
> >
> > Note, BTW, that collapsing TOKEN_..._ARG together, with "kind of argument"
> > moved into bits stolen from ->argnum improves code generation - that
> > switch by token type is _hot_ and it reducing the number of cases
> > gives a measurable speedup. Sure, we don't want heavy work at #define
> > time - most of the macros are never expanded at all, but AFAICS this
> > kind of processing can be dealt with while parsing the body, with no
> > extra passes needed, etc.
> >
> > I'm going down right now, will look into that tomorrow morning...
>
> That turned out to be trickier than I hoped, but I've got something that
> works.
>
> See git://git.kernel.org/pub/scm/linux/kernel/git/viro/sparse.git #va_opt
> (or individual patches in followups)
Nice, I applied the patch at the sparse-dev repo:
https://git.kernel.org/pub/scm/devel/sparse/sparse-dev.git/
I will push to the stable repository and make a cut soon if no issues
are reported.
Chris
>
> __VA_OPT__ supported, AFAICS behaviour matches C23.
> * expansion and stringifying of arguments is full-lazy now -
> done on demand and at most once.
> * va-opt-replacement parsed at #define time, handled correctly
> by dump_macro() (i.e. -dM), comparisons when redefining and at expansion
> time.
> * arglist mangling is gone, so's the argcount kludge.
> * it's no slower than it used to be prior to that series.
>
> I have local followups (tentative fixes for whitespace handling in preprocessor
> and optimizations in tokenizer), but let's deal with that one first.
>
> Shortlog:
> Al Viro (21):
> split copy() into "need to copy" and "can move in place" cases
> expand and simplify the call of dup_token() in copy()
> more dup_token() optimizations
> parsing #define: saner handling of argument count, part 1
> simplify collect_arguments() and fix error handling there
> try_arg(): don't use arglist for argument name lookups
> make expand_has_...() responsible for expanding its argument
> preparing to change argument number encoding for TOKEN_..._ARGUMENT
> steal 2 bits from argnum for argument kind
> on-demand argument expansion
> kill create_arglist()
> stop mangling arglist, get rid of TOKEN_ARG_COUNT
> deal with ## on arguments separately
> preparations for __VA_OPT__ support: reshuffle argument slot assignments
> pre-process.c: split try_arg()
> __VA_OPT__: parsing
> expansion-time va_opt handling
> merge(): saner handling of ->noexpand
> simplify the calling conventions of collect_arguments()
> make expand_one_symbol() inline
> substitute(): convert switch() into cascade of ifs
>
> Diffstat:
> ident-list.h | 1 +
> pre-process.c | 929 +++++++++++++++++-----------
> symbol.h | 1 +
> token.h | 32 +-
> tokenize.c | 4 -
> validation/preprocessor/bad-args.c | 18 +
> validation/preprocessor/dump-macro.c | 13 +
> validation/preprocessor/has-attribute.c | 3 +
> validation/preprocessor/has-builtin.c | 3 +
> validation/preprocessor/va_opt.c | 54 ++
> validation/preprocessor/va_opt2.c | 34 +
> validation/preprocessor/va_opt_compare.c | 28 +
> validation/preprocessor/va_opt_parse.c | 37 ++
> validation/preprocessor/va_opt_whitespace.c | 14 +
> 14 files changed, 797 insertions(+), 374 deletions(-)
> create mode 100644 validation/preprocessor/bad-args.c
> create mode 100644 validation/preprocessor/dump-macro.c
> create mode 100644 validation/preprocessor/va_opt.c
> create mode 100644 validation/preprocessor/va_opt2.c
> create mode 100644 validation/preprocessor/va_opt_compare.c
> create mode 100644 validation/preprocessor/va_opt_parse.c
> create mode 100644 validation/preprocessor/va_opt_whitespace.c
>
>
> PS: as for the interesting uses of __VA_OPT__, consider this:
> ; cat >test.c <<'EOF'
> // based on a fun trick from David Mazières
> // see https://www.scs.stanford.edu/~dm/blog/va-opt.html for the entire story
> // No, it's not unbounded recursion - up to 256 (4^4) elements in __VA_ARGS__;
> // more with trivial modifications, just add more levels to EXPAND...
> #define PARENS ()
> #define EXPAND(...) EXPAND4(EXPAND4(EXPAND4(EXPAND4(__VA_ARGS__))))
> #define EXPAND4(...) EXPAND3(EXPAND3(EXPAND3(EXPAND3(__VA_ARGS__))))
> #define EXPAND3(...) EXPAND2(EXPAND2(EXPAND2(EXPAND2(__VA_ARGS__))))
> #define EXPAND2(...) EXPAND1(EXPAND1(EXPAND1(EXPAND1(__VA_ARGS__))))
> #define EXPAND1(...) __VA_ARGS__
> #define FOR_EACH_PAIR(macro, ...) \
> __VA_OPT__(EXPAND(FOR_EACH_PAIR_HELPER(macro, __VA_ARGS__)))
> #define FOR_EACH_PAIR_HELPER(macro, a1, a2, ...) \
> macro(a1, a2) \
> __VA_OPT__(FOR_EACH_PAIR_AGAIN PARENS (macro, __VA_ARGS__))
> #define FOR_EACH_PAIR_AGAIN() FOR_EACH_PAIR_HELPER
>
> FOR_EACH_PAIR(F, t1, id1, t2, id2, t3, id3, t4, id4, t5, id5, t6, id6)
> EOF
> ; cpp -E test.c
> # 0 "test.c"
> # 0 "<built-in>"
> # 0 "<command-line>"
> # 1 "/usr/include/stdc-predef.h" 1 3 4
> # 0 "<command-line>" 2
> # 1 "test.c"
> # 18 "test.c"
> F(t1, id1) F(t2, id2) F(t3, id3) F(t4, id4) F(t5, id5) F(t6, id6)
> ;
>
> and the same output from sparse, modulo the # ... lines - sparse -E doesn't
> produce those. Our (fairly brittle) analogue is __MAP in linux/syscalls.h
> and if nothing else, unlike __MAP() this thing does not need the number
> of pairs passed as explicit argument. Would be interesting to try unifying
> SYSCALL0..SYSCALL6 into a single macro that would bloody well _count_ the
> arguments...
>
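The "done on demand and at most once" property described in the quoted summary is ordinary memoization. A minimal sketch of the pattern follows; it is a toy model with integers standing in for token lists, and struct lazy_arg, expensive_expand(), get_expanded() and lazy_demo() are hypothetical names, not sparse's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct lazy_arg {
	int raw;	/* stand-in for the unexpanded token list */
	int expanded;	/* cached result, valid only when 'done' is set */
	bool done;
	int work_count;	/* how many times the expensive step actually ran */
};

/* Placeholder for the costly operation (macro-expanding an argument). */
static int expensive_expand(int raw)
{
	return raw * 2;
}

/* Expand on first use, then hand back the cached result. */
int get_expanded(struct lazy_arg *arg)
{
	assert(arg != NULL);
	if (!arg->done) {
		arg->expanded = expensive_expand(arg->raw);
		arg->done = true;
		arg->work_count++;
	}
	return arg->expanded;
}

/* Returns 1 if two uses trigger one expansion and an unused argument
 * triggers none. */
int lazy_demo(void)
{
	struct lazy_arg used = { .raw = 21 };
	struct lazy_arg unused = { .raw = 5 };
	int first = get_expanded(&used);
	int second = get_expanded(&used);

	return first == 42 && second == 42 &&
	       used.work_count == 1 && unused.work_count == 0;
}
```

An argument that the macro body never references in expanded form costs nothing under this scheme, which matters because most macro arguments are used only once or not at all.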
* Re: [RFC PATCH] pre-process: add __VA_OPT__ support
2026-03-16 6:56 ` Al Viro
` (2 preceding siblings ...)
2026-03-17 7:41 ` Chris Li
@ 2026-03-18 6:35 ` Eric Zhang
3 siblings, 0 replies; 42+ messages in thread
From: Eric Zhang @ 2026-03-18 6:35 UTC (permalink / raw)
To: Al Viro; +Cc: linux-sparse, sparse, torvalds, dan.carpenter, Eric Zhang
---
On Mon, Mar 16, 2026 at 06:56:22AM +0000, Al Viro wrote:
> That turned out to be trickier than I hoped, but I've got something that
> works.
Nice! It is much cleaner than what I had in my RFC :) I've tested the branch
against test cases in my previous RFC and some new tests, no regression.
Tested-by: Eric Zhang <zxh@xh-zhang.com>
* Re: [RFC PATCH] pre-process: add __VA_OPT__ support
2026-03-16 16:42 ` [RFC PATCH] pre-process: add __VA_OPT__ support Linus Torvalds
@ 2026-03-19 3:53 ` Al Viro
2026-03-19 4:07 ` Linus Torvalds
0 siblings, 1 reply; 42+ messages in thread
From: Al Viro @ 2026-03-19 3:53 UTC (permalink / raw)
To: Linus Torvalds
Cc: Eric Zhang, linux-sparse, dan.carpenter, chriscli, ben.dooks, rf
On Mon, Mar 16, 2026 at 09:42:01AM -0700, Linus Torvalds wrote:
> I have tested that branch on a few trivial cases, and it looks good to me.
>
> I did write a long rant about how I hate cpp tricks and wish we had a
> few simple extensions (__VA_COUNT__ would be the most simple one,
> because COUNT_ARGS() is disgusting), but it is what it is, and this
> makes things better. So I decided to just delete my rant.
Speaking of rants: gcc code generation in general and around bitfields.
Example: in collect_arg() we have
next->pos.stream = pos->stream;
next->pos.line = pos->line;
next->pos.pos = pos->pos;
next->pos.newline = 0;
in a loop, pos is declared const struct position *. Generates the
following horror:
movzwl 2(%r12), %edx
movl (%r12), %ecx
leaq 8(%rax), %rbp
shrw $6, %dx
andl $1048512, %ecx
movzwl %dx, %edx
salq $22, %rdx
orq %rcx, %rdx
movl 4(%r12), %ecx
andl $2147483647, %ecx
salq $32, %rcx
orq %rcx, %rdx
movq (%rax), %rcx
andq %r14, %rcx
orq %rcx, %rdx
movq %rdx, (%rax)
r12 is 'pos', rax - 'next', r14 comes from
movabsq $-9223372036852678593, %r14
in the beginning of the function (0x800000000020003f). Note that
*everything* prior to the last 4 insns is equivalent to
rbp = &(struct token *)rax->next
rdx = *(u64 *)r12 & 0x7fffffffffc0xfffc0
written in a really convoluted way.
OK, so it doesn't figure out it could bloody well calculate that rdx
value once and store in some register (the same r12, for that matter).
Let's make it simple for the damn thing - pass struct position instead
of struct position *; what we get is
movq %r12, %rdx
leaq 8(%rax), %rbp
movabsq $9223372032559808512, %rcx
andl $1048512, %edx
andq %r12, %rcx
orq %r14, %rdx
orq %rcx, %rdx
movabsq $-9223372036852678593, %rcx
andq (%rax), %rcx
orq %rcx, %rdx
movq %rdx, (%rax)
What the hell? No, really - it's
rdx = r12;
rbp = &(struct token *)rax->next;
rcx = 0x7fffffff00000000;
rdx &= 0xfffc0;
rcx &= r12;
rdx |= rcx;
followed by
rcx = 0x800000000020003f & *(u64 *)rax;
rdx |= rcx;
*(u64 *)rax = rdx;
Leaving aside the utility of repeating the same calculation on each
iteration of the loop, figuring out that
(0x7fffffff00000000 & r12) | (0xfffc0 & r12)
is equal to
0x7fffffff000fffc0 & r12
ought to be within the abilities of the damn compiler - and it *is*
loading a 64bit constant into rcx as it is. What's more, it's not
a preference to using 64bit constants with lower 32 bits clear -
another movabsq in the same chunk is not of that form.
Perhaps it's an explicit store of 0 to ->newline that does it?
Clearing pos.newline in the beginning and have
next->pos.newline = pos.newline;
instead of zeroing in the loop does not change anything (other than
worse register allocation). Moving zeroing of pos.newline into the
caller finally gets that calculation out of loop... and messes with
the register allocation in the caller, which is inlined into expand(),
along with substitute(), do_argument() and quite a few other things.
Granted, collect_arg() is not particularly hot, but... ouch.
I really, really don't like the handling of bitfields ;-/
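The equivalence claimed in this message can be checked with plain shifts and masks. The sketch below works on a 64-bit integer instead of real bitfields so that it does not depend on any compiler's bitfield layout. The field positions are an assumption inferred from the constants in the listings (0xfffc0, the 10-bit shift by 22, the 31-bit mask shifted by 32, and 0x800000000020003f); the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the packed position word, low bit first:
 * 6 bits type, 14 bits stream, 1 bit newline, 1 bit whitespace,
 * 10 bits pos, 31 bits line, 1 bit noexpand. */
#define STREAM_MASK  UINT64_C(0x00000000000fffc0)
#define NEWLINE_MASK UINT64_C(0x0000000000100000)
#define POS_MASK     UINT64_C(0x00000000ffc00000)
#define LINE_MASK    UINT64_C(0x7fffffff00000000)

#define COPY_MASK    (STREAM_MASK | POS_MASK | LINE_MASK)
#define KEEP_MASK    (~(COPY_MASK | NEWLINE_MASK))

/* One field at a time, the way the four C assignments read. */
uint64_t copy_fieldwise(uint64_t dst, uint64_t src)
{
	dst = (dst & ~STREAM_MASK) | (src & STREAM_MASK);
	dst = (dst & ~LINE_MASK) | (src & LINE_MASK);
	dst = (dst & ~POS_MASK) | (src & POS_MASK);
	dst &= ~NEWLINE_MASK;		/* newline = 0 */
	return dst;
}

/* The same effect with two ANDs and one OR. */
uint64_t copy_merged(uint64_t dst, uint64_t src)
{
	return (dst & KEEP_MASK) | (src & COPY_MASK);
}

/* Returns 1 if the masks match the constants in the listing and both
 * routines agree on a few sample values. */
int masks_demo(void)
{
	uint64_t dst = UINT64_C(0xdeadbeefcafef00d);
	uint64_t src = UINT64_C(0x0123456789abcdef);

	return COPY_MASK == UINT64_C(0x7fffffffffcfffc0) &&
	       KEEP_MASK == UINT64_C(0x800000000020003f) &&
	       copy_fieldwise(dst, src) == copy_merged(dst, src) &&
	       copy_fieldwise(0, ~UINT64_C(0)) == copy_merged(0, ~UINT64_C(0)) &&
	       copy_fieldwise(~UINT64_C(0), 0) == copy_merged(~UINT64_C(0), 0);
}
```

Since src & COPY_MASK does not change inside the loop, it could be computed once outside it, which is the hoisting the message complains the compiler fails to do.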
* Re: [RFC PATCH] pre-process: add __VA_OPT__ support
2026-03-19 3:53 ` Al Viro
@ 2026-03-19 4:07 ` Linus Torvalds
2026-03-19 5:34 ` Al Viro
0 siblings, 1 reply; 42+ messages in thread
From: Linus Torvalds @ 2026-03-19 4:07 UTC (permalink / raw)
To: Al Viro; +Cc: Eric Zhang, linux-sparse, dan.carpenter, chriscli, ben.dooks, rf
On Wed, 18 Mar 2026 at 20:50, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Speaking of rants: gcc code generation in general and around bitfields.
Yeah. gcc handling of bitfields is a disaster. I made a gcc bugzilla
about some of this many years ago (almost exactly 15 years ago):
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696
and things haven't really improved much since, afaik.
I suggested at the time that gcc should stop doing bitfields
internally, and translate it all into shifts and masks and then doing
all the normal optimizations on those. But that's not what it does -
gcc keeps it as a "bitfield access" way too long in the optimization
phases, and then does a really bad job of optimizing it all.
And yes, I made that bugzilla entry due to sparse, and that 'struct pos'.
Because it really triggers all kinds of horrendous gcc behavior.
I looked at clang at one point, and iirc it generated *much* better
code, because I think it does the smart thing, which is to get rid of
the notion of bitfields as quickly as possible, and then doing just
regular integer optimizations.
(And the real problem with gcc is really that "byte write followed by
word read", which is horribly horribly expensive because it causes a
pipeline stall. So even when the code looks small, it's really really
bad. Clang actually generates more instructions for the test-case in
that bugzilla, but the code is ten times faster because it doesn't
stall the pipeline)
Linus
* Re: [RFC PATCH] pre-process: add __VA_OPT__ support
2026-03-19 4:07 ` Linus Torvalds
@ 2026-03-19 5:34 ` Al Viro
0 siblings, 0 replies; 42+ messages in thread
From: Al Viro @ 2026-03-19 5:34 UTC (permalink / raw)
To: Linus Torvalds
Cc: Eric Zhang, linux-sparse, dan.carpenter, chriscli, ben.dooks, rf
On Wed, Mar 18, 2026 at 09:07:17PM -0700, Linus Torvalds wrote:
> I looked at clang at one point, and iirc it generated *much* better
> code, because I think it does the smart thing, which is to get rid of
> the notion of bitfields as quickly as possible, and then doing just
> regular integer optimizations.
FWIW, with the local tokenizer patches the single worst spot at the moment
(both for clang and for gcc build) is this:
p = &hash_table[hash];
while ((ident = *p) != NULL) {
if (ident->len == (unsigned char) len) {
in create_hashed_ident(). This is from clang build, where
it's inlined into tokenize_stream():
0.14 │ mov (%rdx,%rcx,8),%rax
12.34 │ mov %r12d,%r15d
0.02 │ test %rax,%rax
0.43 │ ↓ je 33e
│ mov %rsp,%r14
0.29 │ ↓ jmp 319
│ nop
│310:┌─→mov 0x0(%r13),%rax
2.00 │ │ test %rax,%rax
0.07 │ │↓ je 345
0.00 │319:│ mov %rax,%r13
│ │if (ident->len == (unsigned char) len) {
0.19 │ ├──cmp %r12b,0x10(%rax)
16.86 │ └──jne 310
and this is gcc build, where it's not inlined, so the percentages are
several times higher (out of 5.9% vs. out of 17.1% on the profiles I'm
looking at):
│ p = &hash_table[hash]; ▒
0.77 │ mov (%rax,%rdx,8),%rbx ▒
│ while ((ident = *p) != NULL) { ▒
27.77 │ test %rbx,%rbx ◆
0.98 │ ↓ jne 3b ▒
│ ↓ jmp c2 ▒
│ nop ▒
│ ident_hit++; ▒
│ return ident; ▒
│ } ▒
│ next: ▒
│ //misses++; ▒
│ p = &ident->next; ▒
0.00 │30:┌─→mov (%rbx),%rax ▒
│ │while ((ident = *p) != NULL) { ▒
6.11 │ │ test %rax,%rax ▒
0.24 │ │↓ je 70 ▒
│ │ mov %rax,%rbx ▒
│ │if (ident->len == (unsigned char) len) { ▒
0.81 │3b:├──cmp %bpl,0x10(%rbx) ▒
50.82 │ └──jne 30 ▒
Most of the accesses are to single-element chain; it's not walking the
lists that hurts, it's the very first step. The profiles are for userland
cycles; looking for stalled-cycles-frontend gives exact same hotspots.
The next one is lookup_symbol(); there we also walk linked lists.
The only difference is that lists are often longer than one entry...
Not sure what can be done about either.
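For context, the loop being profiled is a chained hash lookup. The sketch below shows the general shape under stated assumptions: the table size, the hash function, the use of malloc() and the names hash_name(), lookup_or_create() and hash_demo() are all hypothetical and differ from sparse's create_hashed_ident().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HASH_SIZE 64

struct ident {
	struct ident *next;	/* other identifiers in the same bucket */
	unsigned char len;
	char name[];		/* not NUL-terminated; 'len' bytes */
};

static struct ident *hash_table[HASH_SIZE];

/* Toy hash; the real tokenizer computes its hash while scanning. */
static unsigned int hash_name(const char *name, size_t len)
{
	unsigned int hash = 0;

	for (size_t i = 0; i < len; i++)
		hash = hash * 31 + (unsigned char)name[i];
	return hash % HASH_SIZE;
}

struct ident *lookup_or_create(const char *name, size_t len)
{
	struct ident **p = &hash_table[hash_name(name, len)];
	struct ident *ident;

	assert(name != NULL && len <= 255);
	while ((ident = *p) != NULL) {
		/* First touch of *ident: cheap length test before memcmp(). */
		if (ident->len == (unsigned char)len &&
		    memcmp(ident->name, name, len) == 0)
			return ident;
		p = &ident->next;
	}

	ident = malloc(sizeof(*ident) + len);
	if (!ident)
		return NULL;
	ident->next = NULL;
	ident->len = (unsigned char)len;
	memcpy(ident->name, name, len);
	*p = ident;		/* append to the end of the chain */
	return ident;
}

/* Returns 1 if equal names share one entry and different names do not. */
int hash_demo(void)
{
	struct ident *a = lookup_or_create("foo", 3);
	struct ident *b = lookup_or_create("foo", 3);
	struct ident *c = lookup_or_create("bar", 3);

	return a != NULL && c != NULL && a == b && a != c;
}
```

The length comparison is the first dereference of a pointer just loaded from the bucket, so it plausibly is where a cache miss on the entry would show up, consistent with the observation that the first step rather than the chain walk dominates.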
Thread overview: 42+ messages
[not found] <cover.1771930766.git.dan.carpenter@linaro.org>
2026-02-24 11:07 ` [PATCH] sparse: add support for __VA_OPT__ Dan Carpenter
2026-02-24 11:16 ` Ben Dooks
2026-02-24 11:56 ` Dan Carpenter
2026-02-24 12:42 ` Richard Fitzgerald
2026-02-24 13:15 ` Ben Dooks
2026-02-25 2:39 ` Chris Li
2026-02-25 3:36 ` Al Viro
2026-02-25 5:29 ` [RFC PATCH] pre-process: add __VA_OPT__ support Eric Zhang
2026-02-25 6:40 ` Al Viro
2026-02-25 7:27 ` Al Viro
2026-02-25 8:14 ` Eric Zhang
2026-02-25 22:18 ` Al Viro
2026-02-26 7:29 ` Al Viro
2026-03-16 6:56 ` Al Viro
2026-03-16 7:03 ` [PATCH 01/21] split copy() into "need to copy" and "can move in place" cases Al Viro
2026-03-16 7:03 ` [PATCH 02/21] expand and simplify the call of dup_token() in copy() Al Viro
2026-03-16 7:03 ` [PATCH 03/21] more dup_token() optimizations Al Viro
2026-03-16 7:03 ` [PATCH 04/21] parsing #define: saner handling of argument count, part 1 Al Viro
2026-03-16 7:03 ` [PATCH 05/21] simplify collect_arguments() and fix error handling there Al Viro
2026-03-16 7:04 ` [PATCH 06/21] try_arg(): don't use arglist for argument name lookups Al Viro
2026-03-16 7:04 ` [PATCH 07/21] make expand_has_...() responsible for expanding its argument Al Viro
2026-03-16 7:04 ` [PATCH 08/21] preparing to change argument number encoding for TOKEN_..._ARGUMENT Al Viro
2026-03-16 7:04 ` [PATCH 09/21] steal 2 bits from argnum for argument kind Al Viro
2026-03-16 7:04 ` [PATCH 10/21] on-demand argument expansion Al Viro
2026-03-16 7:04 ` [PATCH 11/21] kill create_arglist() Al Viro
2026-03-16 7:04 ` [PATCH 12/21] stop mangling arglist, get rid of TOKEN_ARG_COUNT Al Viro
2026-03-16 7:04 ` [PATCH 13/21] deal with ## on arguments separately Al Viro
2026-03-16 7:04 ` [PATCH 14/21] preparations for __VA_OPT__ support: reshuffle argument slot assignments Al Viro
2026-03-16 7:04 ` [PATCH 15/21] pre-process.c: split try_arg() Al Viro
2026-03-16 7:04 ` [PATCH 16/21] __VA_OPT__: parsing Al Viro
2026-03-16 7:04 ` [PATCH 17/21] expansion-time va_opt handling Al Viro
2026-03-16 7:04 ` [PATCH 18/21] merge(): saner handling of ->noexpand Al Viro
2026-03-16 7:04 ` [PATCH 19/21] simplify the calling conventions of collect_arguments() Al Viro
2026-03-16 7:04 ` [PATCH 20/21] make expand_one_symbol() inline Al Viro
2026-03-16 7:04 ` [PATCH 21/21] substitute(): convert switch() into cascade of ifs Al Viro
2026-03-16 16:42 ` [RFC PATCH] pre-process: add __VA_OPT__ support Linus Torvalds
2026-03-19 3:53 ` Al Viro
2026-03-19 4:07 ` Linus Torvalds
2026-03-19 5:34 ` Al Viro
2026-03-17 7:41 ` Chris Li
2026-03-18 6:35 ` Eric Zhang
2026-02-25 7:05 ` [PATCH] sparse: add support for __VA_OPT__ Chris Li