linux-sparse.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* fun with declarations and definitions
@ 2009-02-02  7:30 Al Viro
  2009-02-02 20:17 ` Christopher Li
  2009-02-03  3:07 ` Christopher Li
  0 siblings, 2 replies; 18+ messages in thread
From: Al Viro @ 2009-02-02  7:30 UTC (permalink / raw)
  To: linux-sparse

	There are several interesting problems caused by the fact that
we create a separate symbol for each declaration of given function.

1)

static inline int f(void);
static int g(void)
{
	return f();
}
static inline int f(void)
{
	return 0;
}
gives an error, since the instance of f in g is not associated with anything
useful.  Needless to say, this is a perfectly valid C.  Moreover,
static inline int f(void)
{
	return 0;
}
static inline int f(void);
static int g(void)
{
	return f();
}
will step on the same thing.  Currently we get the former case all over the
place in the kernel, thanks to the way DEFINE_SYSCALLx() is done.

I have a kinda-sorta fix for that (basically, add a reference to external
definition to struct symbol and update it correctly - it's not hard).
However, that doesn't cover *another* weirdness in the same area -
gccisms around extern inline.  There we can have inline and external
definitions in the same translation unit (and they can be different,
to make the things even more interesting).  Anyway, that's a separate
story - as it is, we don't even have a way to tell 'extern inline ...'
from 'inline ...'

2) More fun in the same area: checks for SYM_FN in external_declaration()
do not take into account the possibility of
	void f(int);
	typeof(f) g;
Ergo, we get linkage-less function declarations.  Fun, innit?  No patch.

3) Better yet, sparse does _NOT_ reject
	typeof(f) g
	{
		...
	}
which is obviously a Bloody Bad Idea(tm) (just think what that does to
argument list).  Similar crap is triggerable with typedef.  IMO, we really
ought to reject _that_ - not only 6.9.1(2) explicitly requires that, but
there's no even remotely sane way to deal with arguments.

4)
	static void f(void);
	...
	void f(void);
triggers "warning: symbol 'f' was not declared. Should it be static?"
which is at least very confusing - it *is* declared and it *is* static.
IOW, we do not collect the linkage information sanely.  (2) will make
fixing that one very interesting.

Anyway, proposed patch for (1) follows:

Subject: [PATCH] Handle mix of declarations and definitions

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 evaluate.c                 |    7 +++++++
 parse.c                    |   16 ++++++++++++++++
 symbol.h                   |    1 +
 validation/context-named.c |    2 ++
 validation/definitions.c   |   12 ++++++++++++
 5 files changed, 38 insertions(+), 0 deletions(-)
 create mode 100644 validation/definitions.c

diff --git a/evaluate.c b/evaluate.c
index f976645..5ed31a9 100644
--- a/evaluate.c
+++ b/evaluate.c
@@ -2747,6 +2747,10 @@ static int evaluate_symbol_call(struct expression *expr)
 	if (ctype->ctype.modifiers & MOD_INLINE) {
 		int ret;
 		struct symbol *curr = current_fn;
+
+		if (ctype->definition)
+			ctype = ctype->definition;
+
 		current_fn = ctype->ctype.base_type;
 
 		ret = inline_function(expr, ctype);
@@ -3052,6 +3056,9 @@ static struct symbol *evaluate_symbol(struct symbol *sym)
 	if (base_type->type == SYM_FN) {
 		struct symbol *curr = current_fn;
 
+		if (sym->definition && sym->definition != sym)
+			return evaluate_symbol(sym->definition);
+
 		current_fn = base_type;
 
 		examine_fn_arguments(base_type);
diff --git a/parse.c b/parse.c
index eb31871..3cdcf63 100644
--- a/parse.c
+++ b/parse.c
@@ -2105,6 +2105,7 @@ static struct token *parse_function_body(struct token *token, struct symbol *dec
 	struct symbol_list **old_symbol_list;
 	struct symbol *base_type = decl->ctype.base_type;
 	struct statement *stmt, **p;
+	struct symbol *prev;
 	struct symbol *arg;
 
 	old_symbol_list = function_symbol_list;
@@ -2138,6 +2139,18 @@ static struct token *parse_function_body(struct token *token, struct symbol *dec
 	if (!(decl->ctype.modifiers & MOD_INLINE))
 		add_symbol(list, decl);
 	check_declaration(decl);
+	decl->definition = decl;
+	prev = decl->same_symbol;
+	if (prev && prev->definition) {
+		warning(decl->pos, "multiple definitions for function '%s'",
+			show_ident(decl->ident));
+		info(prev->definition->pos, " the previous one is here");
+	} else {
+		while (prev) {
+			prev->definition = decl;
+			prev = prev->same_symbol;
+		}
+	}
 	function_symbol_list = old_symbol_list;
 	if (function_computed_goto_list) {
 		if (!function_computed_target_list)
@@ -2282,6 +2295,7 @@ struct token *external_declaration(struct token *token, struct symbol_list **lis
 			}
 		}
 	} else if (base_type && base_type->type == SYM_FN) {
+		if (decl->next_id && decl->next_id->scope == decl->scope)
 		/* K&R argument declaration? */
 		if (lookup_type(token))
 			return parse_k_r_arguments(token, decl, list);
@@ -2309,6 +2323,8 @@ struct token *external_declaration(struct token *token, struct symbol_list **lis
 			}
 		}
 		check_declaration(decl);
+		if (decl->same_symbol)
+			decl->definition = decl->same_symbol->definition;
 
 		if (!match_op(token, ','))
 			break;
diff --git a/symbol.h b/symbol.h
index c4d7f28..e5369d2 100644
--- a/symbol.h
+++ b/symbol.h
@@ -155,6 +155,7 @@ struct symbol {
 			struct expression *initializer;
 			struct entrypoint *ep;
 			long long value;		/* Initial value */
+			struct symbol *definition;
 		};
 	};
 	union /* backend */ {
diff --git a/validation/context-named.c b/validation/context-named.c
index 58310e9..a896621 100644
--- a/validation/context-named.c
+++ b/validation/context-named.c
@@ -501,6 +501,8 @@ static void good_mixed_with_if(void)
  * check-name: Check -Wcontext with lock names
  *
  * check-error-start
+context-named.c:417:13: warning: multiple definitions for function 'good_fn2'
+context-named.c:408:13:  the previous one is here
 context-named.c:86:3: warning: context imbalance in 'warn_lock1': wrong count at exit
 context-named.c:86:3:    context 'TEST': wanted 0, got 1
 context-named.c:93:3: warning: context imbalance in 'warn_lock2': wrong count at exit
diff --git a/validation/definitions.c b/validation/definitions.c
new file mode 100644
index 0000000..fce7393
--- /dev/null
+++ b/validation/definitions.c
@@ -0,0 +1,12 @@
+static inline int f(void);
+static int g(void)
+{
+        return f();
+}
+static inline int f(void)
+{
+	return 0;
+}
+/*
+ * check-name: finding definitions
+ */
-- 
1.5.6.6


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-02  7:30 fun with declarations and definitions Al Viro
@ 2009-02-02 20:17 ` Christopher Li
  2009-02-02 20:58   ` Al Viro
  2009-02-03  3:07 ` Christopher Li
  1 sibling, 1 reply; 18+ messages in thread
From: Christopher Li @ 2009-02-02 20:17 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-sparse

On Sun, Feb 1, 2009 at 11:30 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>        There are several interesting problems caused by the fact that
> we create a separate symbol for each declaration of given function.

Thanks for the patch. That is great. This is actually one of the two
hard problem in
sparse. I haven't able to solved them. (BTW, the other one was running
out of modifier bits.)

You patch seems base on Josh's tree. In my tree the context change has been
back out, so the  validation/context-named.c is gone. Other than that, the patch
seems apply fine. I will take a closer look at it and get back to you.

Chris

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-02 20:17 ` Christopher Li
@ 2009-02-02 20:58   ` Al Viro
  2009-02-02 22:25     ` Christopher Li
  0 siblings, 1 reply; 18+ messages in thread
From: Al Viro @ 2009-02-02 20:58 UTC (permalink / raw)
  To: Christopher Li; +Cc: linux-sparse

On Mon, Feb 02, 2009 at 12:17:06PM -0800, Christopher Li wrote:
> On Sun, Feb 1, 2009 at 11:30 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >        There are several interesting problems caused by the fact that
> > we create a separate symbol for each declaration of given function.
> 
> Thanks for the patch. That is great. This is actually one of the two
> hard problem in
> sparse. I haven't able to solved them. (BTW, the other one was running
> out of modifier bits.)

Modifier bits are going to get easier - I have a patch series that
takes a bunch out (basically, to hell with everything in MOD_SPECIFIER -
the only hard part is on the parser side and I've got a saner way to
deal with that).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-02 20:58   ` Al Viro
@ 2009-02-02 22:25     ` Christopher Li
  0 siblings, 0 replies; 18+ messages in thread
From: Christopher Li @ 2009-02-02 22:25 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-sparse

On Mon, Feb 2, 2009 at 12:58 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> Modifier bits are going to get easier - I have a patch series that
> takes a bunch out (basically, to hell with everything in MOD_SPECIFIER -
> the only hard part is on the parser side and I've got a saner way to
> deal with that).

That sounds like an interesting patch. Cared to shared it? Even it is not
quite ready, I would like to take a look and maybe I can learn some thing.

I want to store more than a few bits attribute information. So I was hacking a
patch to make an separate structure for extended attributes.
Only the symbol that use those extended attribute will have it.
The address space and context will move to the extend attributes
as well. Most of the symbol don't have special attribute, so it will save up
some space overall. Does it sound like some thing sane to do?

Any way, I haven't able to get it to work. The big messy part is when sparse
propagate those attributes, some times it need to keep a separate copy of
the extended attributes, because this extended attribute is option for symbols.
Maybe I should try it again.

Chris

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-02  7:30 fun with declarations and definitions Al Viro
  2009-02-02 20:17 ` Christopher Li
@ 2009-02-03  3:07 ` Christopher Li
  2009-02-03  4:13   ` Al Viro
  2009-02-03  4:41   ` Al Viro
  1 sibling, 2 replies; 18+ messages in thread
From: Christopher Li @ 2009-02-03  3:07 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-sparse

On Sun, Feb 1, 2009 at 11:30 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> Anyway, proposed patch for (1) follows:

I read the patch, seems reasonable. It is only solve the inline case though.
The more generic problem still exist, symbol look up between partial
prototype declare and the real declare will get symbol with partial
information.

It would be great if we can apply the definition to the first symbol. But
I can haven't find a clean way to do it yet.

Most likely I will apply your patch until we find a better way to do it.

> @@ -2282,6 +2295,7 @@ struct token *external_declaration(struct token *token, struct symbol_list **lis
>                        }
>                }
>        } else if (base_type && base_type->type == SYM_FN) {
> +               if (decl->next_id && decl->next_id->scope == decl->scope)

Not sure what this is trying to do. Shouldn't the next line to be indented
if I read it correctly?

Chris

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-03  3:07 ` Christopher Li
@ 2009-02-03  4:13   ` Al Viro
  2009-02-05 18:40     ` Christopher Li
  2009-02-03  4:41   ` Al Viro
  1 sibling, 1 reply; 18+ messages in thread
From: Al Viro @ 2009-02-03  4:13 UTC (permalink / raw)
  To: Christopher Li; +Cc: linux-sparse

On Mon, Feb 02, 2009 at 07:07:24PM -0800, Christopher Li wrote:
> On Sun, Feb 1, 2009 at 11:30 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > Anyway, proposed patch for (1) follows:
> 
> I read the patch, seems reasonable. It is only solve the inline case though.
> The more generic problem still exist, symbol look up between partial
> prototype declare and the real declare will get symbol with partial
> information.

... and unfortunately, that's what we _have_ to do.  Reason: behaviour
of typeof().  Example:

extern int a[];
int a[__builtin_types_compatible_p(typeof(a),int [3]) + 3];	/* 4 */
int b[__builtin_types_compatible_p(typeof(a),int [3]) + 3];	/* 3 */

Similar for
extern int a[];		/* a is int [] */
typedef typeof(a) A;	/* A is int []  _AND_ _WILL_ _REMAIN_ _SO_ */
int a[10];		/* a is int [10] now */
A b;			/* int [] */
int b[1];		/* no problem */

Similar applies for functions getting pointers to functions, etc. - having
the type refined by subsequent declaration is not retroactive.

Mind you, we are not doing composite types in any useful way and _that_ is
where we ought to change things.  Subsequent declarations should pick
the additional type information; we do propagate the inline definition
back to the call sites, provided that we had the function declared inline
before those.  However, the type information should _not_ be spread back.

And it's not just due to typeof (I suspect that honest attempt to define
its semantics in presense of back-propagation like that will end up to
be Turing-complete, but even if it is decidable it's prohibitively hard).
Note that we do have similar effects in standard C - e.g.

extern int a[];
void f(void)
{
	extern int a[24];
	memset(a, 0, sizeof(a)); /* OK */
}
void g(void)
{
	memset(a, 0, sizeof(a)); /* error - sizeof(a) is hidden from us here */
}

doesn't involve any extensions and having it barf on the second memset call
there is certainly intended behaviour.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-03  3:07 ` Christopher Li
  2009-02-03  4:13   ` Al Viro
@ 2009-02-03  4:41   ` Al Viro
  2009-02-03  6:28     ` Ralf Wildenhues
  2009-02-05 18:52     ` Christopher Li
  1 sibling, 2 replies; 18+ messages in thread
From: Al Viro @ 2009-02-03  4:41 UTC (permalink / raw)
  To: Christopher Li; +Cc: linux-sparse

On Mon, Feb 02, 2009 at 07:07:24PM -0800, Christopher Li wrote:
> >        } else if (base_type && base_type->type == SYM_FN) {
> > +               if (decl->next_id && decl->next_id->scope == decl->scope)
> 
> Not sure what this is trying to do. Shouldn't the next line to be indented
> if I read it correctly?

Yuck.   That's a leftover that hadn't been caught since it affects only
K&R definitions.  Kill that line...

ObDeclarationParsing: I'm sorely (_very_ sorely) tempted to claim that
gcc folks are violating GPL.  Proof: gcc/c-parse.in, around productions
for declspecs_*.  Of course, they can say that this _is_ the preferred
form for making modifications, but having such admission made in public
would be worth it...  Trying to sort out the __attribute__ handling in
there just plain hurts ;-/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-03  4:41   ` Al Viro
@ 2009-02-03  6:28     ` Ralf Wildenhues
  2009-02-05 18:52     ` Christopher Li
  1 sibling, 0 replies; 18+ messages in thread
From: Ralf Wildenhues @ 2009-02-03  6:28 UTC (permalink / raw)
  To: Al Viro; +Cc: Christopher Li, linux-sparse

Hello,

* Al Viro wrote on Tue, Feb 03, 2009 at 05:41:56AM CET:
> ObDeclarationParsing: I'm sorely (_very_ sorely) tempted to claim that
> gcc folks are violating GPL.  Proof: gcc/c-parse.in, around productions
> for declspecs_*.  Of course, they can say that this _is_ the preferred
> form for making modifications, [...]

FWIW, that has since been rewritten.  Quoting gcc/ChangeLog-2005:

| 2005-02-25  Joseph S. Myers  <joseph@codesourcery.com>
| 
|         * c-parser.c: New file.
|         * c-parse.in: Remove.
[...]

Cheers,
Ralf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-03  4:13   ` Al Viro
@ 2009-02-05 18:40     ` Christopher Li
  2009-02-05 18:47       ` Derek M Jones
  0 siblings, 1 reply; 18+ messages in thread
From: Christopher Li @ 2009-02-05 18:40 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-sparse

On Mon, Feb 2, 2009 at 8:13 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> I read the patch, seems reasonable. It is only solve the inline case though.
>> The more generic problem still exist, symbol look up between partial
>> prototype declare and the real declare will get symbol with partial
>> information.
>
> ... and unfortunately, that's what we _have_ to do.  Reason: behaviour
> of typeof().  Example:
>
> extern int a[];
> int a[__builtin_types_compatible_p(typeof(a),int [3]) + 3];     /* 4 */
> int b[__builtin_types_compatible_p(typeof(a),int [3]) + 3];     /* 3 */
>
> Similar for
> extern int a[];         /* a is int [] */
> typedef typeof(a) A;    /* A is int []  _AND_ _WILL_ _REMAIN_ _SO_ */
> int a[10];              /* a is int [10] now */
> A b;                    /* int [] */
> int b[1];               /* no problem */
>
> Similar applies for functions getting pointers to functions, etc. - having
> the type refined by subsequent declaration is not retroactive.

I see. So the subsequent declaration for the same symbol should inherent
previous declaration attributes, but no the other way around.

> Mind you, we are not doing composite types in any useful way and _that_ is
> where we ought to change things.  Subsequent declarations should pick
> the additional type information; we do propagate the inline definition
> back to the call sites, provided that we had the function declared inline
> before those.  However, the type information should _not_ be spread back.

Right. I think currently sparse treat the subsequent declaration like a new
one. It check the type is compatible with previous declaration. But it does
not merge the previous declaration information.

> doesn't involve any extensions and having it barf on the second memset call
> there is certainly intended behaviour.

That is good to know. I will keep that in mind.

Chris

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-05 18:40     ` Christopher Li
@ 2009-02-05 18:47       ` Derek M Jones
  2009-02-05 20:28         ` Al Viro
  0 siblings, 1 reply; 18+ messages in thread
From: Derek M Jones @ 2009-02-05 18:47 UTC (permalink / raw)
  To: Christopher Li; +Cc: Al Viro, linux-sparse

Christopher,

> Right. I think currently sparse treat the subsequent declaration like a new
> one. It check the type is compatible with previous declaration. But it does
> not merge the previous declaration information.

Sparse needs to generate composite types:
http://c0x.coding-guidelines.com/6.2.7.html

-- 
Derek M. Jones                         tel: +44 (0) 1252 520 667
Knowledge Software Ltd                 mailto:derek@knosof.co.uk
Source code analysis                   http://www.knosof.co.uk

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-03  4:41   ` Al Viro
  2009-02-03  6:28     ` Ralf Wildenhues
@ 2009-02-05 18:52     ` Christopher Li
  1 sibling, 0 replies; 18+ messages in thread
From: Christopher Li @ 2009-02-05 18:52 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-sparse

On Mon, Feb 2, 2009 at 8:41 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> Yuck.   That's a leftover that hadn't been caught since it affects only
> K&R definitions.  Kill that line...

Apply and pushed.

Chris

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-05 18:47       ` Derek M Jones
@ 2009-02-05 20:28         ` Al Viro
  2009-02-05 21:19           ` Al Viro
  2009-02-05 22:41           ` Christopher Li
  0 siblings, 2 replies; 18+ messages in thread
From: Al Viro @ 2009-02-05 20:28 UTC (permalink / raw)
  To: Derek M Jones; +Cc: Christopher Li, linux-sparse

On Thu, Feb 05, 2009 at 06:47:17PM +0000, Derek M Jones wrote:
> Christopher,
>
>> Right. I think currently sparse treat the subsequent declaration like a new
>> one. It check the type is compatible with previous declaration. But it does
>> not merge the previous declaration information.
>
> Sparse needs to generate composite types:
> http://c0x.coding-guidelines.com/6.2.7.html

Of course it does need that (and it's not C0X news, obviously).  We still
have the type handling messed up in a lot of areas, so the plan is to
sort out the declaration parsing, then get rid of the warts in type
representation and handling, then deal with composites.

IMO the right way to look at that crap is:
	* any declaration gives a new struct symbol, with type being the
composite of that given by declaration and that of previously seen one
(which, in turn, has gathered all earlier stuff)
	* type nodes should be treated as expressions, with on-demand
evaluation and referential transparency (i.e. any rewrite replaces with
equal).  We are not that far from such state.
	* composite type is built by unifying two trees, with evaluation
driven by comparisons.
	* we should _NOT_ do update-in-place from e.g. int[] to int[3] -
it's guaranteed to mess e.g. typedef handling, not to mention making the
VLA implementation hard as hell.  We are not referentially transparent
for e.g. struct expression and we'd paid a _lot_ for that.

Note that even check for type compatibility is fscked up for function
types - as it is, we treat void() and void(int) as incompatible types.
So yes, the composite type handling is certainly needed.  We also have
deeply fscked treatment of declarations, mostly thanks to fallout from
trying to be compatible with gcc in attribute handling ;-/

BTW, typedef treatment in declarators is also b0rken: witness sparse barfing on

typedef int T;
extern void f(int);
void g(int x)
{
        int (T);
        T = x;
        f(T);
}

which is a valid C (we have T redeclared in the function scope as int,
with declarator -> direct-declarator -> ( declarator ) -> ( identifier )
as derivation).  sparse mistakes int (T) for typename, does *NOT* notice
that typename has no business whatsoever being there and silently proceeds
to T = ..., without having redeclared T as object of type int.  It sees
typedef-name <something>, decides that it's a beginning of external-definition
and vomits on the following =.

IOW, the rule in direct_declarator() for distinguishing between function
and non-function is broken...

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-05 20:28         ` Al Viro
@ 2009-02-05 21:19           ` Al Viro
  2009-02-06  5:36             ` Al Viro
  2009-02-05 22:41           ` Christopher Li
  1 sibling, 1 reply; 18+ messages in thread
From: Al Viro @ 2009-02-05 21:19 UTC (permalink / raw)
  To: Derek M Jones; +Cc: Christopher Li, linux-sparse

On Thu, Feb 05, 2009 at 08:28:11PM +0000, Al Viro wrote:
> typedef int T;
> extern void f(int);
> void g(int x)
> {
>         int (T);
>         T = x;
>         f(T);
> }
> 
> which is a valid C (we have T redeclared in the function scope as int,
> with declarator -> direct-declarator -> ( declarator ) -> ( identifier )
> as derivation).  sparse mistakes int (T) for typename, does *NOT* notice
> that typename has no business whatsoever being there and silently proceeds
> to T = ..., without having redeclared T as object of type int.  It sees
> typedef-name <something>, decides that it's a beginning of external-definition
> and vomits on the following =.
> 
> IOW, the rule in direct_declarator() for distinguishing between function
> and non-function is broken...

PS: note that C grammar has an ambiguity, resolved in constraints (6.7.5.3p11).
We have 3 different cases:
	* typename
	* normal declaration
	* parameter declaration
In the first case, int (T) is "function that takes T and returns int"; we can
have no identifiers in nested abstract-declarator, so there's no problem.
In the second case, int (T) is "declare X as object of type int"; we can't
have parameter-type-list or identifier-list without having seen an identifier.
Again, no problem.  In the third case, though, we can have both
	parameter-declaration -> declaration-specifiers declarator
and
	parameter-declaration -> declaration-specifiers abstract-declarator
with the former going through
	direct-declarator -> ( declarator ) -> ( identifier )
and the latter -
	direct-abstract-declarator ->
	direct-abstract-declarator? ( parameter-type-list) ->
	( parameter-type-list ) -> ( identifier )

It is resolved by "an identifier that can be interpreted either as a typedef
name or as a parameter name shall be taken as a typedef name".

IOW, direct_declarator() (which doubles for direct-abstract-declarator) should
have more than one-bit indication of which case we've got.  Right now it's
done by "have we passed a non-NULL ident ** to store the identifier being
declared"; that's not enough.  What we need is explicit 'is that a part of
parameter declaration' flag; then the rule turns into
	if (p && *p)
		fn = 1; /* we'd seen identifier already, can't be nested */
	else if match_op(next, ')')
		fn = 1; /* empty list can't be direct-declarator or
			 * direct-abstract-declarator */
	else
		fn = (in_parameter && lookup_type(next));

We also need to barf on lack of identifier in definition, unless it
has no storage class specifiers and the type has been struct/union/enum,
straight from the input - not a typedef or typeof resolving to such, but
that's a separate story.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-05 20:28         ` Al Viro
  2009-02-05 21:19           ` Al Viro
@ 2009-02-05 22:41           ` Christopher Li
  2009-02-05 23:22             ` Al Viro
  1 sibling, 1 reply; 18+ messages in thread
From: Christopher Li @ 2009-02-05 22:41 UTC (permalink / raw)
  To: Al Viro; +Cc: Derek M Jones, linux-sparse

On Thu, Feb 5, 2009 at 12:28 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Of course it does need that (and it's not C0X news, obviously).  We still
> have the type handling messed up in a lot of areas, so the plan is to
> sort out the declaration parsing, then get rid of the warts in type
> representation and handling, then deal with composites.


>
> IMO the right way to look at that crap is:
>        * any declaration gives a new struct symbol, with type being the
> composite of that given by declaration and that of previously seen one
> (which, in turn, has gathered all earlier stuff)
>        * type nodes should be treated as expressions, with on-demand
> evaluation and referential transparency (i.e. any rewrite replaces with
> equal).  We are not that far from such state.

Can you elaborate the referential transparency? I currently have a vague
idea of what you are trying to do. Where should I start if I am going
to turn the idea into code. Should we delay the detail of type
information until evaluate stage?


Chris

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-05 22:41           ` Christopher Li
@ 2009-02-05 23:22             ` Al Viro
  0 siblings, 0 replies; 18+ messages in thread
From: Al Viro @ 2009-02-05 23:22 UTC (permalink / raw)
  To: Christopher Li; +Cc: Derek M Jones, linux-sparse

On Thu, Feb 05, 2009 at 02:41:00PM -0800, Christopher Li wrote:

> Can you elaborate the referential transparency? I currently have a vague
> idea of what you are trying to do. Where should I start if I am going
> to turn the idea into code. Should we delay the detail of type
> information until evaluate stage?

Basically, "replace equal with equal".  I.e. whatever we do with struct
symbol that might be visible to somebody, we should not change the
interpretation of type for *any* struct symbol, including ones that
might refer to that one.

Take a look at uninlining for a nice example of the fallout you get
when that property doesn't hold.  We cannibalize struct expression
during evaluation a _lot_, so we have to be bloody careful about what
we share and what we must copy.

For struct expression we do that in a lot of places and it's probably
worth to keep doing due to memory footprint concerns.  For types
we can avoid that.

As for the laziness - yes, that's what classify_type() et.al. is about.
We want to do exactly as little as possible; basically, *all* type
node evaluation (aka examining) should be demand-driven.  It mostly
holds, but we definitely could be lazier, e.g. in qualifier-related
stuff.

I'll do a more complete braindump (ideally - along with the declaration-parsing
patchset, so that this work could really start) once I dig myself from
under the !@#!@#!@# patch and mail piles elsewhere a bit.

			Al, terminally annoyed by the size of backlog ;-/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-05 21:19           ` Al Viro
@ 2009-02-06  5:36             ` Al Viro
  2009-02-09  7:52               ` Christopher Li
  0 siblings, 1 reply; 18+ messages in thread
From: Al Viro @ 2009-02-06  5:36 UTC (permalink / raw)
  To: Derek M Jones; +Cc: Christopher Li, linux-sparse

On Thu, Feb 05, 2009 at 09:19:21PM +0000, Al Viro wrote:
> IOW, direct_declarator() (which doubles for direct-abstract-declarator) should
> have more than one-bit indication of which case we've got.  Right now it's
> done by "have we passed a non-NULL ident ** to store the identifier being
> declared"; that's not enough.  What we need is explicit 'is that a part of
> parameter declaration' flag; then the rule turns into
> 	if (p && *p)
> 		fn = 1; /* we'd seen identifier already, can't be nested */
> 	else if match_op(next, ')')
> 		fn = 1; /* empty list can't be direct-declarator or
> 			 * direct-abstract-declarator */
> 	else
> 		fn = (in_parameter && lookup_type(next));

Umm...  It's a bit more subtle (p goes NULL after the nested one is
handled), so we need to keep track of "don't allow nested declarator from
that point on" explicitly.  Patch follows:

Subject: [PATCH] Handle nested declarators vs. parameter lists correctly

Seeing a typedef name after ( means that we have a parameter-type-list
only if we are parsing a parameter declaration or a typename; otherwise
it might very well be a redeclaration (e.g. int (T); when T had been a
typedef in outer scope).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 parse.c                        |   26 +++++++++++++++-----------
 validation/nested-declarator.c |   11 +++++++++++
 2 files changed, 26 insertions(+), 11 deletions(-)
 create mode 100644 validation/nested-declarator.c

diff --git a/parse.c b/parse.c
index 7e3191a..73e7b65 100644
--- a/parse.c
+++ b/parse.c
@@ -1227,7 +1227,7 @@ static struct token *abstract_array_declarator(struct token *token, struct symbo
 }
 
 static struct token *parameter_type_list(struct token *, struct symbol *, struct ident **p);
-static struct token *declarator(struct token *token, struct symbol *sym, struct ident **p);
+static struct token *declarator(struct token *token, struct symbol *sym, struct ident **p, int);
 
 static struct token *handle_attributes(struct token *token, struct ctype *ctype, unsigned int keywords)
 {
@@ -1247,13 +1247,15 @@ static struct token *handle_attributes(struct token *token, struct ctype *ctype,
 	return token;
 }
 
-static struct token *direct_declarator(struct token *token, struct symbol *decl, struct ident **p)
+static struct token *direct_declarator(struct token *token, struct symbol *decl, struct ident **p, int prefer_abstract)
 {
 	struct ctype *ctype = &decl->ctype;
+	int dont_nest = 0;
 
 	if (p && token_type(token) == TOKEN_IDENT) {
 		*p = token->ident;
 		token = token->next;
+		dont_nest = 1;
 	}
 
 	for (;;) {
@@ -1275,14 +1277,16 @@ static struct token *direct_declarator(struct token *token, struct symbol *decl,
 			int fn;
 
 			next = handle_attributes(next, ctype, KW_ATTRIBUTE);
-			fn = (p && *p) || match_op(next, ')') || lookup_type(next);
+			fn = dont_nest || match_op(next, ')') ||
+				(prefer_abstract && lookup_type(next));
 
 			if (!fn) {
 				struct symbol *base_type = ctype->base_type;
-				token = declarator(next, decl, p);
+				token = declarator(next, decl, p, prefer_abstract);
 				token = expect(token, ')', "in nested declarator");
 				while (ctype->base_type != base_type)
 					ctype = &ctype->base_type->ctype;
+				dont_nest = 1;
 				p = NULL;
 				continue;
 			}
@@ -1336,10 +1340,10 @@ static struct token *pointer(struct token *token, struct ctype *ctype)
 	return token;
 }
 
-static struct token *declarator(struct token *token, struct symbol *sym, struct ident **p)
+static struct token *declarator(struct token *token, struct symbol *sym, struct ident **p, int prefer_abstract)
 {
 	token = pointer(token, &sym->ctype);
-	return direct_declarator(token, sym, p);
+	return direct_declarator(token, sym, p, prefer_abstract);
 }
 
 static struct token *handle_bitfield(struct token *token, struct symbol *decl)
@@ -1399,7 +1403,7 @@ static struct token *declaration_list(struct token *token, struct symbol_list **
 		struct ident *ident = NULL;
 		struct symbol *decl = alloc_symbol(token->pos, SYM_NODE);
 		decl->ctype = ctype;
-		token = declarator(token, decl, &ident);
+		token = declarator(token, decl, &ident, 0);
 		decl->ident = ident;
 		if (match_op(token, ':')) {
 			token = handle_bitfield(token, decl);
@@ -1439,7 +1443,7 @@ static struct token *parameter_declaration(struct token *token, struct symbol **
 	sym = alloc_symbol(token->pos, SYM_NODE);
 	sym->ctype = ctype;
 	*tree = sym;
-	token = declarator(token, sym, &ident);
+	token = declarator(token, sym, &ident, 1);
 	sym->ident = ident;
 	apply_modifiers(token->pos, &sym->ctype);
 	sym->endpos = token->pos;
@@ -1451,7 +1455,7 @@ struct token *typename(struct token *token, struct symbol **p, int mod)
 	struct symbol *sym = alloc_symbol(token->pos, SYM_NODE);
 	*p = sym;
 	token = declaration_specifiers(token, &sym->ctype, 0);
-	token = declarator(token, sym, NULL);
+	token = declarator(token, sym, NULL, 1);
 	apply_modifiers(token->pos, &sym->ctype);
 	if (sym->ctype.modifiers & MOD_STORAGE & ~mod)
 		warning(sym->pos, "storage class in typename (%s)",
@@ -2261,7 +2265,7 @@ struct token *external_declaration(struct token *token, struct symbol_list **lis
 	token = declaration_specifiers(token, &ctype, 0);
 	decl = alloc_symbol(token->pos, SYM_NODE);
 	decl->ctype = ctype;
-	token = declarator(token, decl, &ident);
+	token = declarator(token, decl, &ident, 0);
 	apply_modifiers(token->pos, &decl->ctype);
 
 	decl->endpos = token->pos;
@@ -2333,7 +2337,7 @@ struct token *external_declaration(struct token *token, struct symbol_list **lis
 		decl = alloc_symbol(token->pos, SYM_NODE);
 		decl->ctype = ctype;
 		token = declaration_specifiers(token, &decl->ctype, 1);
-		token = declarator(token, decl, &ident);
+		token = declarator(token, decl, &ident, 0);
 		apply_modifiers(token->pos, &decl->ctype);
 		decl->endpos = token->pos;
 		if (!ident) {
diff --git a/validation/nested-declarator.c b/validation/nested-declarator.c
new file mode 100644
index 0000000..24ed833
--- /dev/null
+++ b/validation/nested-declarator.c
@@ -0,0 +1,11 @@
+typedef int T;
+extern void f(int);
+static void g(int x)
+{
+	int (T);
+	T = x;
+	f(T);
+}
+/*
+ * check-name: nested declarator vs. parameters
+ */
-- 
1.5.6.6


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-06  5:36             ` Al Viro
@ 2009-02-09  7:52               ` Christopher Li
  2009-02-09  8:54                 ` Al Viro
  0 siblings, 1 reply; 18+ messages in thread
From: Christopher Li @ 2009-02-09  7:52 UTC (permalink / raw)
  To: Al Viro; +Cc: Derek M Jones, linux-sparse

On Thu, Feb 5, 2009 at 9:36 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Thu, Feb 05, 2009 at 09:19:21PM +0000, Al Viro wrote:
>> IOW, direct_declarator() (which doubles for direct-abstract-declarator) should
>> have more than one-bit indication of which case we've got.  Right now it's
>> done by "have we passed a non-NULL ident ** to store the identifier being
>> declared"; that's not enough.  What we need is explicit 'is that a part of
>> parameter declaration' flag; then the rule turns into
>>       if (p && *p)
>>               fn = 1; /* we'd seen identifier already, can't be nested */
>>       else if match_op(next, ')')
>>               fn = 1; /* empty list can't be direct-declarator or
>>                        * direct-abstract-declarator */
>>       else
>>               fn = (in_parameter && lookup_type(next));
>
> Umm...  It's a bit more subtle (p goes NULL after the nested one is
> handled), so we need to keep track of "don't allow nested declarator from
> that point on" explicitly.  Patch follows:
>
> Subject: [PATCH] Handle nested declarators vs. parameter lists correctly
>
> Seeing a typedef name after ( means that we have a parameter-type-list
> only if we are parsing a parameter declaration or a typename; otherwise
> it might very well be a redeclaration (e.g. int (T); when T had been a
> typedef in outer scope).

The patch looks great. Applied.

Chris

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: fun with declarations and definitions
  2009-02-09  7:52               ` Christopher Li
@ 2009-02-09  8:54                 ` Al Viro
  0 siblings, 0 replies; 18+ messages in thread
From: Al Viro @ 2009-02-09  8:54 UTC (permalink / raw)
  To: Christopher Li; +Cc: Derek M Jones, linux-sparse

On Sun, Feb 08, 2009 at 11:52:23PM -0800, Christopher Li wrote:
> > Subject: [PATCH] Handle nested declarators vs. parameter lists correctly
> >
> > Seeing a typedef name after ( means that we have a parameter-type-list
> > only if we are parsing a parameter declaration or a typename; otherwise
> > it might very well be a redeclaration (e.g. int (T); when T had been a
> > typedef in outer scope).
> 
> The patch looks great. Applied.

Still leaves a couple of uglies, BTW.  One (easily fixed):

int [2]*p[3];

is actually (still) accepted.  Nevermind, we just need to set dont_nest
in a few more places (and no, it's not a regression; moreover, similar
crap works with functions instead of arrays).

Nastier one:
#define A __attribute__((address_space(1)))
void (*f)(A int *x, A int *y) = (void *)0;
void g(int A *x)
{
        f(x, x);
}

now try test-parsing on it.  You'll see that f gets type
void (A *)(int *, int A *);
instead of expected
void (*)(int A *, int A *);

Reason: cretinous gcc grammar is fscked in head.  There's a very, _very_
good reason why C doesn't allow e.g.
	int (const x);
Consider seeing
	void f(int (const
What would that "(const" be?  Beginning of parameter list in the first
parameter of f or a beginning of nested declarator in the same?  In
C it's the former and we can tell that immediately.

I really don't want to get into the reasons why gcc decided to allow that with
s/const/__attribute__((....))/ - it's too long a flame, but they did and
so we have to do lookahead from hell in parser.  Namely, eat all attributes
first and only then decide whether this ( starts a nested declarator or a
parameter list.

... except that we do it wrong and these attributes land on what will by
SYM_FN and not its first parameter.

BTW, the longer I'm looking at that, the less I like the original decision
to use __attribute__ for our extensions.  Or continued use of __attribute__
by gcc folks, while we are at it.

One lovely example:
	T __attribute__((whatever)) *p;
is NOT pointer-to-whatever-T; it's whatever-pointer-to-T.  Which is clearly
not what we want for __user et.al.  The only way to get it applied *before*
derivation is, BTW,
	T (__attribute__((whatever)) *p);
(and yes, that's where this crap in syntax has come from).  Mind you,
	T * __attribute__((whatever)) *p;
*is* pointer-to-whatever-pointer-to-T, which makes the situation even more
inconsistent.

While we are at it, our handling of __attribute__((mode(...))) is wrong -
not only
__attribute__((mode(HI)) long x;
is fine, it's actually s16.  Moreover,
int __attribute__((mode(HI)) *x;
is only going to work on 16bit boxen, since it's "16bit pointer to int".
64/32bit analogs of that construct *are* used in the wild on some weird
targets.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2009-02-09  8:54 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-02-02  7:30 fun with declarations and definitions Al Viro
2009-02-02 20:17 ` Christopher Li
2009-02-02 20:58   ` Al Viro
2009-02-02 22:25     ` Christopher Li
2009-02-03  3:07 ` Christopher Li
2009-02-03  4:13   ` Al Viro
2009-02-05 18:40     ` Christopher Li
2009-02-05 18:47       ` Derek M Jones
2009-02-05 20:28         ` Al Viro
2009-02-05 21:19           ` Al Viro
2009-02-06  5:36             ` Al Viro
2009-02-09  7:52               ` Christopher Li
2009-02-09  8:54                 ` Al Viro
2009-02-05 22:41           ` Christopher Li
2009-02-05 23:22             ` Al Viro
2009-02-03  4:41   ` Al Viro
2009-02-03  6:28     ` Ralf Wildenhues
2009-02-05 18:52     ` Christopher Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).